feat: crawler scaffold with chromium launcher (0.22.0)
- crawler module (browser, source trait, jobs, diff) + binary
- chromiumoxide launcher with fetcher feature (auto-downloads Chromium on first run, caches under ~/.cache/mangalord/chromium)
- LaunchOptions struct with extra_args, parseable from CRAWLER_BROWSER_MODE and CRAWLER_BROWSER_ARGS
- migration 0012 introduces sources, manga_sources, chapter_sources, crawler_jobs
- integration tests for headed + headless launch, ipify load+parse, and extra-args propagation (all #[ignore], opt-in)
15
backend/src/crawler/diff.rs
Normal file
@@ -0,0 +1,15 @@
//! Change-detection rules between the source and our DB.
//!
//! | Event              | Signal                                                                                 |
//! |--------------------|----------------------------------------------------------------------------------------|
//! | New manga          | `(source_id, source_manga_key)` not in `manga_sources`                                 |
//! | Updated metadata   | freshly computed `metadata_hash` differs from the stored one                           |
//! | Dropped manga      | `last_seen_at < discover_run_started_at` for N consecutive successful discover runs    |
//! | New chapter        | `(source_id, source_chapter_key)` not in `chapter_sources`                             |
//! | Dropped chapter    | present in DB but absent from the latest `fetch_chapter_list` for the same manga       |
//!
//! Dropped is always a soft flag (`dropped_at`), never a row delete;
//! restoring is a matter of clearing the flag if the source brings the
//! item back.
//!
//! Scaffold only; implementations land once `repo::crawler` exists.
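The rules in the table above could be sketched roughly as follows. This is not the committed implementation (the module is scaffold only), and every name here is hypothetical: `MangaEvent`, `classify_seen_manga`, `next_missed_runs`, `is_dropped`, and `diff_chapters` are illustrative. The sketch also assumes that timestamps are Unix seconds and that the "N consecutive runs" rule is tracked with a per-manga miss counter, which the source does not specify.

```rust
use std::collections::HashSet;

/// Outcome for a manga that the current discover run did see.
#[derive(Debug, PartialEq, Eq)]
pub enum MangaEvent {
    New,
    UpdatedMetadata,
    Unchanged,
}

/// `stored_hash` is `None` when `(source_id, source_manga_key)` has no row
/// in `manga_sources`; otherwise it is the stored `metadata_hash`.
pub fn classify_seen_manga(stored_hash: Option<&str>, fresh_hash: &str) -> MangaEvent {
    match stored_hash {
        None => MangaEvent::New,
        Some(h) if h != fresh_hash => MangaEvent::UpdatedMetadata,
        Some(_) => MangaEvent::Unchanged,
    }
}

/// Assumed bookkeeping: after each successful discover run, bump the miss
/// counter if the manga was not seen during that run, else reset it.
pub fn next_missed_runs(last_seen_at: i64, run_started_at: i64, missed_runs: u32) -> u32 {
    if last_seen_at < run_started_at {
        missed_runs + 1
    } else {
        0
    }
}

/// A manga gets the soft `dropped_at` flag once it has been missed `n` times in a row.
pub fn is_dropped(missed_runs: u32, n: u32) -> bool {
    missed_runs >= n
}

/// Chapter keys for one manga on one source.
/// Returns `(new, dropped)`, each sorted for stable output.
pub fn diff_chapters(
    stored: &HashSet<String>,
    latest: &HashSet<String>,
) -> (Vec<String>, Vec<String>) {
    let mut new: Vec<String> = latest.difference(stored).cloned().collect();
    let mut dropped: Vec<String> = stored.difference(latest).cloned().collect();
    new.sort();
    dropped.sort();
    (new, dropped)
}
```

Under this sketch, a dropped item that reappears in `latest` needs no special case: the caller would clear `dropped_at`, matching the soft-flag rule in the module docs.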