Extends the live dashboard so an operator can see exactly what is being
fetched, in real time:
- Chapters being crawled now are tracked in the status as `active_chapters`
(manga title · ch.N) with a live page counter that climbs per stored page
(set_chapter_pages, pushed via the existing watch→SSE). The dispatcher
registers each via an RAII ChapterGuard (sync Mutex) that removes the
entry on completion, panic, or timeout-drop — replacing the old per-worker
slot model.
- Covers: status now carries the cover currently being fetched (`current_cover`,
set around download_and_store_cover in both the metadata pass and backfill)
and a `covers_queued` backlog count; the CoverBackfill phase gains index/total.
- Two paginated backlog endpoints (fetched on demand, auto-refreshed when the
live counts change): GET /admin/crawler/active-jobs (which chapters of which
mangas are queued/running) and GET /admin/crawler/covers (mangas missing a
cover). repo: list_active_jobs, list_missing_cover_mangas, count_missing_covers.
- dispatch_target now also returns manga title + chapter number.
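
The RAII registration described for ChapterGuard can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `ActiveChapters` registry, the `u64` job id key, and the field names are assumptions; only `ChapterGuard`, `set_chapter_pages`, and the sync Mutex come from the description above.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, MutexGuard};

/// One in-flight chapter as it might appear in `active_chapters`.
/// Field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct ActiveChapter {
    pub manga_title: String,
    pub chapter_number: u32,
    pub pages_stored: u32,
}

/// Shared registry of in-flight chapters, keyed by a hypothetical job id.
#[derive(Clone, Default)]
pub struct ActiveChapters {
    inner: Arc<Mutex<HashMap<u64, ActiveChapter>>>,
}

impl ActiveChapters {
    /// Registers a chapter and returns a guard that removes it on drop.
    pub fn register(&self, job_id: u64, manga_title: &str, chapter_number: u32) -> ChapterGuard {
        let entry = ActiveChapter {
            manga_title: manga_title.to_string(),
            chapter_number,
            pages_stored: 0,
        };
        self.lock().insert(job_id, entry);
        ChapterGuard { registry: self.clone(), job_id }
    }

    /// Returns a sorted copy of the current entries for publishing.
    pub fn snapshot(&self) -> Vec<ActiveChapter> {
        let mut chapters: Vec<ActiveChapter> = self.lock().values().cloned().collect();
        chapters.sort_by(|a, b| {
            (&a.manga_title, a.chapter_number).cmp(&(&b.manga_title, b.chapter_number))
        });
        chapters
    }

    fn lock(&self) -> MutexGuard<'_, HashMap<u64, ActiveChapter>> {
        // Recover from poisoning so one panicking task cannot wedge the registry.
        self.inner.lock().unwrap_or_else(|e| e.into_inner())
    }
}

/// Removes its entry when dropped: on normal completion, during panic
/// unwinding, or when a timed-out future holding the guard is dropped.
pub struct ChapterGuard {
    registry: ActiveChapters,
    job_id: u64,
}

impl ChapterGuard {
    /// Updates the live page counter for this chapter.
    pub fn set_chapter_pages(&self, pages: u32) {
        if let Some(chapter) = self.registry.lock().get_mut(&self.job_id) {
            chapter.pages_stored = pages;
        }
    }
}

impl Drop for ChapterGuard {
    fn drop(&mut self) {
        self.registry.lock().remove(&self.job_id);
    }
}
```

Because cleanup lives in `Drop`, no code path has to remember to unregister a chapter, which is presumably what makes this simpler than a per-worker slot model.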
Frontend: the crawler page replaces the Workers table with an Active-chapters
table (live page bars), adds a current-cover line + covers-queued figure, and
two backlog sections (Queued chapters / Queued covers) with search + Pager,
auto-refetched via $effect on the live counts.
Tests: status guard/page + cover unit tests; repo list/count tests; endpoint
tests; frontend api tests. Version 0.53.1 -> 0.54.0.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- requeue_dead_jobs: when a chapter has multiple dead jobs, revive only the
newest (DISTINCT ON the chapter key) so a single UPDATE can't flip two
dead rows for one chapter to pending and violate the partial unique dedup
index (previously this returned a 500 and requeued nothing). Non-chapter
jobs fall back to row id. Regression test added. (critical)
- coordinated_restart: a caller that coalesces into an in-progress restart
now reports that restart's real outcome instead of a blind success, so the
session-update "valid" / restart "ok" signal can't be falsely positive.
- SessionController::update: reject control chars / ';' / ',' in PHPSESSID
before it reaches the cookie string + CDP cookie. Test added.
- Add non-admin 403 test on a mutating crawler endpoint; fix stale
stream-to-storage doc comment.
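
The PHPSESSID check could look something like the sketch below. The function name `validate_session_id` and the `SessionIdError` type are hypothetical; the only rule taken from the description above is that control characters, ';' and ',' are rejected.

```rust
/// Why a candidate PHPSESSID was rejected (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum SessionIdError {
    /// The value contains a character that could break out of the cookie string.
    ForbiddenChar(char),
}

/// Returns the value unchanged if it contains no control characters,
/// ';' or ','. Any of those could inject extra attributes or cookies
/// once the value is spliced into a `Cookie:` header.
pub fn validate_session_id(value: &str) -> Result<&str, SessionIdError> {
    match value
        .chars()
        .find(|c| c.is_control() || *c == ';' || *c == ',')
    {
        Some(c) => Err(SessionIdError::ForbiddenChar(c)),
        None => Ok(value),
    }
}
```

Rejecting rather than escaping keeps the rule easy to test, and a legitimate session id should not contain any of these characters anyway.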
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lets the admin manga page requeue a single failed chapter's dead job(s)
inline, without a job id. Adds RequeueScope::Chapter + the matching
request variant and a repo test.
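
As a rough illustration of how a chapter scope might sit beside the existing all/manga/job scopes: the variant payloads, the `DeadJob` struct, and the `matches` helper below are all assumptions made for the example; only the name `RequeueScope::Chapter` comes from this change.

```rust
/// Which dead jobs a requeue request targets. Payload shapes are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RequeueScope {
    All,
    Manga { manga_id: i64 },
    /// Targets a chapter directly, so the caller needs no job id.
    Chapter { manga_id: i64, chapter_id: i64 },
    Job { job_id: i64 },
}

/// Minimal stand-in for a dead job row (hypothetical fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeadJob {
    pub job_id: i64,
    pub manga_id: i64,
    /// `None` for non-chapter jobs.
    pub chapter_id: Option<i64>,
}

impl RequeueScope {
    /// Returns true if the job falls inside this scope.
    pub fn matches(&self, job: &DeadJob) -> bool {
        match *self {
            RequeueScope::All => true,
            RequeueScope::Manga { manga_id } => job.manga_id == manga_id,
            RequeueScope::Chapter { manga_id, chapter_id } => {
                job.manga_id == manga_id && job.chapter_id == Some(chapter_id)
            }
            RequeueScope::Job { job_id } => job.job_id == job_id,
        }
    }
}
```

In the real repo layer this filtering would more likely be a SQL WHERE clause; the in-memory predicate only shows the intended semantics.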
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds the in-process observability + control infrastructure the admin
dashboard consumes:
- status.rs: CrawlerStatus/Phase/WorkerState + StatusHandle. The daemon
publishes its current phase (idle/walking/fetching-metadata/cover-backfill),
per-worker activity, and last-pass summary. Wired through the cron,
run_metadata_pass, and the worker loop.
- session_control.rs: SessionController refreshes PHPSESSID at runtime —
rewrites the shared reqwest cookie jar, updates the value on_launch reads,
persists to crawler_state (survives restart), and clears the expired flag.
on_launch now reads the live session instead of a startup snapshot.
- RealChapterDispatcher auto-triggers a coordinated browser restart after
CRAWLER_BROWSER_RESTART_THRESHOLD consecutive transient failures.
- repo::crawler: list_dead_jobs, requeue_dead_jobs (all/manga/job, bypassing
the quarantine, skipping live duplicates), job_state_counts.
- AppState gains CrawlerControl bundling browser_manager + session + status
+ metadata_pass for the admin endpoints.
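
The restart trigger based on consecutive transient failures can be sketched with a simple atomic counter. `RestartTrigger` and its method names are hypothetical, and the threshold is passed in directly rather than read from CRAWLER_BROWSER_RESTART_THRESHOLD; the real dispatcher may track failures differently.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Counts consecutive transient failures and signals when a restart is due.
pub struct RestartTrigger {
    threshold: u32,
    consecutive: AtomicU32,
}

impl RestartTrigger {
    /// A threshold of 0 is treated as 1 so the trigger can always fire.
    pub fn new(threshold: u32) -> Self {
        Self {
            threshold: threshold.max(1),
            consecutive: AtomicU32::new(0),
        }
    }

    /// Records one transient failure. Returns true for the failure that
    /// reaches the threshold, and resets the streak at that point.
    pub fn record_transient_failure(&self) -> bool {
        let streak = self.consecutive.fetch_add(1, Ordering::SeqCst) + 1;
        if streak == self.threshold {
            self.consecutive.store(0, Ordering::SeqCst);
            true
        } else {
            false
        }
    }

    /// Any success breaks the streak.
    pub fn record_success(&self) {
        self.consecutive.store(0, Ordering::SeqCst);
    }
}
```

A caller would invoke the coordinated restart when `record_transient_failure` returns true. Under heavy concurrency the counter alone is only approximate, so the restart itself still needs to tolerate overlapping requests.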
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>