Extends the live dashboard so an operator can see exactly what's being
fetched, in real time:
- Chapters being crawled now are tracked in the status as `active_chapters`
(manga title · ch.N) with a live page counter that climbs per stored page
(set_chapter_pages, pushed via the existing watch→SSE). The dispatcher
registers each via an RAII ChapterGuard (sync Mutex) that removes the
entry on completion, panic, or timeout-drop — replacing the old per-worker
slot model.
- Covers: status now carries the cover currently being fetched (`current_cover`,
set around download_and_store_cover in both the metadata pass and backfill)
and a `covers_queued` backlog count; CoverBackfill phase gains index/total.
- Two paginated backlog endpoints (fetched on demand, auto-refreshed when the
live counts change): GET /admin/crawler/active-jobs (which chapters of which
mangas are queued/running) and GET /admin/crawler/covers (mangas missing a
cover). repo: list_active_jobs, list_missing_cover_mangas, count_missing_covers.
- dispatch_target now also returns manga title + chapter number.
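
A minimal sketch of the ChapterGuard idea above, not the actual code. The
`Status` type, the `u64` job-id key, and the field names are assumptions made
for illustration; only `ChapterGuard`, `set_chapter_pages`, and the sync Mutex
come from this change. The point is that `Drop` removes the entry on every
exit path, including an unwinding panic or a dropped (timed-out) task.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, MutexGuard};

/// One row of the hypothetical `active_chapters` map.
#[derive(Debug, Clone, PartialEq)]
pub struct ActiveChapter {
    pub manga_title: String,
    pub chapter_number: u32,
    pub pages_stored: u32,
}

/// Shared status handle (assumed shape); keyed by a hypothetical job id.
#[derive(Clone, Default)]
pub struct Status {
    active: Arc<Mutex<HashMap<u64, ActiveChapter>>>,
}

impl Status {
    /// Register a chapter and return a guard that unregisters it on drop.
    pub fn register(&self, job_id: u64, manga_title: &str, chapter_number: u32) -> ChapterGuard {
        let entry = ActiveChapter {
            manga_title: manga_title.to_string(),
            chapter_number,
            pages_stored: 0,
        };
        self.lock().insert(job_id, entry);
        ChapterGuard { status: self.clone(), job_id }
    }

    /// Update the live page counter for a registered chapter.
    pub fn set_chapter_pages(&self, job_id: u64, pages: u32) {
        let mut map = self.lock();
        if let Some(chapter) = map.get_mut(&job_id) {
            chapter.pages_stored = pages;
        }
    }

    /// Copy of the currently active chapters.
    pub fn snapshot(&self) -> Vec<ActiveChapter> {
        self.lock().values().cloned().collect()
    }

    fn lock(&self) -> MutexGuard<'_, HashMap<u64, ActiveChapter>> {
        // Recover from poisoning so one panicking worker cannot wedge the status.
        self.active.lock().unwrap_or_else(|e| e.into_inner())
    }
}

/// RAII guard: the entry lives exactly as long as the guard does.
pub struct ChapterGuard {
    status: Status,
    job_id: u64,
}

impl Drop for ChapterGuard {
    fn drop(&mut self) {
        self.status.lock().remove(&self.job_id);
    }
}
```

Holding the guard in the worker future means no explicit cleanup call is
needed; dropping the future is enough.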
Frontend: the crawler page replaces the Workers table with an Active-chapters
table (live page bars), adds a current-cover line + covers-queued figure, and
two backlog sections (Queued chapters / Queued covers) with search + Pager,
auto-refetched via $effect on the live counts.
Tests: status guard/page + cover unit tests; repo list/count tests; endpoint
tests; frontend api tests. Version 0.53.1 -> 0.54.0.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the dashboard's 5s polling with a Server-Sent Events stream:
- StatusHandle gains a tokio `watch` version bumped on every mutation;
GET /admin/crawler/stream subscribes and pushes a composed snapshot
immediately on connect, then on every status change (instant, no
lost-wakeup) plus a 5s backstop for DB queue counts / browser phase.
- Non-status signals poke the notifier so they push immediately too:
session-expired (worker), session update / clear-expired / browser
restart (endpoints).
- compose_status is shared by the one-shot GET and the stream; the stream
tolerates transient DB errors with a keep-alive comment instead of
tearing down.
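
A std-only analog of the version-bump pattern above, for illustration. The
change itself uses a tokio `watch` channel; this sketch swaps in a
Mutex + Condvar so it runs without dependencies, and `VersionNotifier` and its
method names are hypothetical. A subscriber compares the version it last saw
with the current one, so a bump that lands before it starts waiting is still
observed (no lost wakeup), and the timeout plays the role of the backstop.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Monotonic version counter that wakes waiters on every bump.
#[derive(Clone, Default)]
pub struct VersionNotifier {
    inner: Arc<(Mutex<u64>, Condvar)>,
}

impl VersionNotifier {
    /// Bump the version; call after every status mutation.
    pub fn bump(&self) {
        let (lock, cvar) = &*self.inner;
        *lock.lock().unwrap() += 1;
        cvar.notify_all();
    }

    /// Current version, used as the subscriber's starting point.
    pub fn current(&self) -> u64 {
        *self.inner.0.lock().unwrap()
    }

    /// Block until the version differs from `seen` or `backstop` elapses.
    /// Returns the version observed on wake-up (equal to `seen` on timeout).
    pub fn wait_for_change(&self, seen: u64, backstop: Duration) -> u64 {
        let (lock, cvar) = &*self.inner;
        let guard = lock.lock().unwrap();
        let (guard, _timed_out) = cvar
            .wait_timeout_while(guard, backstop, |version| *version == seen)
            .unwrap();
        *guard
    }
}
```

A stream loop would push a snapshot, remember the returned version, and call
`wait_for_change` again; a timeout simply triggers the periodic refresh.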
Frontend: the crawler page opens an EventSource on mount and closes it on
destroy, so the subscription is scoped to the active page (no global
subscription). A one-shot fetch still paints initial state / serves as a
fallback if SSE is blocked; a live/reconnecting indicator reflects the
connection. The existing reverse proxy already streams SSE (its abort
timer is cleared once response headers arrive), so no proxy change needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- requeue_dead_jobs: when a chapter has multiple dead jobs, revive only the
newest (DISTINCT ON the chapter key) so a single UPDATE can't flip two
dead rows for one chapter to pending and violate the partial unique dedup
index (was a 500 that requeued nothing). Non-chapter jobs fall back to row
id. Regression test added. (critical)
- coordinated_restart: a caller that coalesces into an in-progress restart
now reports that restart's real outcome instead of a blind success, so the
session-update "valid" / restart "ok" signal can't be falsely positive.
- SessionController::update: reject control chars / ';' / ',' in PHPSESSID
before it reaches the cookie string + CDP cookie. Test added.
- Add non-admin 403 test on a mutating crawler endpoint; fix stale
stream-to-storage doc comment.
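
A sketch of the PHPSESSID check described above, under assumptions: the
function name and error type are hypothetical, and the real
SessionController::update may apply further rules. It rejects exactly the
characters named in this change (control characters, ';' and ','), which are
the ones that could break out of a cookie header value.

```rust
/// Why a candidate session id was rejected (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum SessionIdError {
    ForbiddenChar(char),
}

/// Reject control characters, ';' and ',' before the value is placed in a
/// cookie string. Returns the first offending character on failure.
pub fn validate_phpsessid(value: &str) -> Result<(), SessionIdError> {
    match value
        .chars()
        .find(|c| c.is_control() || *c == ';' || *c == ',')
    {
        Some(c) => Err(SessionIdError::ForbiddenChar(c)),
        None => Ok(()),
    }
}
```

Validating before building the cookie string keeps a pasted value such as
`abc; Domain=evil` from injecting extra cookie attributes.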
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
GET  /admin/crawler                        live status (phase, workers, last pass,
                                           session, browser, queue)
POST /admin/crawler/run                    trigger an out-of-cycle metadata pass
POST /admin/crawler/browser/restart        coordinated Chromium restart
POST /admin/crawler/session                refresh PHPSESSID + re-probe
POST /admin/crawler/session/clear-expired  clear the sticky expired flag
GET  /admin/crawler/dead-jobs              paginated dead-letter list
POST /admin/crawler/dead-jobs/requeue      requeue all / per-manga / single
All endpoints are cookie-only via RequireAdmin; control endpoints return 503
when the daemon is disabled; mutations are audit-logged. Reads compose the
live status with DB-derived queue counts.
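
The gating rules above can be sketched as a pure function. This is an
illustration only: `gate` and its parameters are hypothetical, the order of
the checks is an assumption, and so is the idea that read endpoints keep
working while the daemon is disabled. The real handlers go through the
RequireAdmin extractor.

```rust
/// Decide whether a request may proceed; Err carries the HTTP status code.
pub fn gate(is_admin: bool, is_control_endpoint: bool, daemon_enabled: bool) -> Result<(), u16> {
    // Assumed order: authorization first, so non-admins never learn daemon state.
    if !is_admin {
        return Err(403);
    }
    // Control endpoints need a running daemon to act on.
    if is_control_endpoint && !daemon_enabled {
        return Err(503);
    }
    Ok(())
}
```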
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>