feat: crawler scaffold with chromium launcher (0.22.0)
- crawler module (browser, source trait, jobs, diff) + binary
- chromiumoxide launcher with fetcher feature (auto-downloads Chromium on first run, caches under ~/.cache/mangalord/chromium)
- LaunchOptions struct with extra_args, parseable from CRAWLER_BROWSER_MODE and CRAWLER_BROWSER_ARGS
- migration 0012 introduces sources, manga_sources, chapter_sources, crawler_jobs
- integration tests for headed + headless launch, ipify load+parse, and extra-args propagation (all #[ignore], opt-in)
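
For readers who want a feel for the env-driven launch configuration, here is a minimal sketch of how a `LaunchOptions` value might be parsed from `CRAWLER_BROWSER_MODE` and `CRAWLER_BROWSER_ARGS`. Only the struct name, the `extra_args` field, and the two variable names come from the commit message. The `BrowserMode` enum, the `mode` field, the accepted values `headless` / `headed`, the headless default, and whitespace-separated arguments are all assumptions made for illustration, and the actual implementation may differ.

```rust
use std::collections::HashMap;

/// Assumed enum: the commit only says a "mode" is parsed from the environment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrowserMode {
    Headless,
    Headed,
}

/// Hypothetical shape; only `extra_args` is named in the commit message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LaunchOptions {
    pub mode: BrowserMode,
    pub extra_args: Vec<String>,
}

impl LaunchOptions {
    /// Parse from any key/value lookup, so the logic can be exercised
    /// without mutating the real process environment.
    pub fn from_lookup<F>(get: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let raw_mode = get("CRAWLER_BROWSER_MODE");
        // Assumption: an unset or empty variable falls back to headless.
        let mode = match raw_mode.as_deref().map(str::trim) {
            None | Some("") | Some("headless") => BrowserMode::Headless,
            Some("headed") => BrowserMode::Headed,
            Some(other) => return Err(format!("unknown CRAWLER_BROWSER_MODE: {other}")),
        };
        // Assumption: extra arguments are separated by whitespace.
        let extra_args: Vec<String> = get("CRAWLER_BROWSER_ARGS")
            .unwrap_or_default()
            .split_whitespace()
            .map(str::to_owned)
            .collect();
        Ok(Self { mode, extra_args })
    }

    /// Convenience wrapper that reads the real process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}

fn main() {
    // Demo values held in a map instead of the real environment.
    let vars: HashMap<&str, &str> = HashMap::from([
        ("CRAWLER_BROWSER_MODE", "headed"),
        ("CRAWLER_BROWSER_ARGS", "--no-sandbox --disable-gpu"),
    ]);
    let opts = LaunchOptions::from_lookup(|key| vars.get(key).map(|v| v.to_string()));
    println!("from map: {opts:?}");
    println!("from env: {:?}", LaunchOptions::from_env());
}
```

Taking a lookup closure rather than calling `std::env::var` directly keeps the parser testable, which suits opt-in integration tests like the ones this commit describes.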
19
backend/src/crawler/mod.rs
Normal file
@@ -0,0 +1,19 @@
//! Crawler subsystem.
//!
//! Runs as its own binary (`src/bin/crawler.rs`) and shares `domain`,
//! `repo`, and `storage` with the API binary. Layering mirrors the
//! `Storage` trait pattern: callers depend on the `source::Source`
//! trait, not on a concrete site; new sites plug in as additional
//! impls without touching the job runner.
//!
//! Submodules:
//! - [`browser`]: launches and pools Chromium via `chromiumoxide`.
//!   First run downloads a known-good build via the `fetcher` feature.
//! - [`source`]: the `Source` trait. Per-site impls live alongside it.
//! - [`jobs`]: job kinds, queue wrapper, handler dispatch.
//! - [`diff`]: change detection — new / updated / dropped semantics.

pub mod browser;
pub mod diff;
pub mod jobs;
pub mod source;
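
The layering described in the module docs (callers depend on a `Source` trait, and sites plug in as extra impls) can be illustrated with a small sketch. This diff does not show the real `source::Source` trait, so the method names, the `ChapterRef` type, the `ExampleSite` impl, and the `run_list_job` function below are hypothetical stand-ins. The real trait is probably async and browser-backed, while this version is synchronous to stay self-contained.

```rust
/// Hypothetical chapter reference returned by a site.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChapterRef {
    pub number: String,
    pub url: String,
}

/// Stand-in for `source::Source`; the real trait's methods are not shown
/// in this commit's diff.
pub trait Source {
    fn name(&self) -> &str;
    fn list_chapters(&self, manga_key: &str) -> Result<Vec<ChapterRef>, String>;
}

/// A made-up site impl that returns canned data instead of crawling.
pub struct ExampleSite;

impl Source for ExampleSite {
    fn name(&self) -> &str {
        "example"
    }

    fn list_chapters(&self, manga_key: &str) -> Result<Vec<ChapterRef>, String> {
        Ok((1..=2)
            .map(|n| ChapterRef {
                number: n.to_string(),
                url: format!("https://example.invalid/{manga_key}/{n}"),
            })
            .collect())
    }
}

/// The runner sees only `dyn Source`, so adding a site does not touch it.
pub fn run_list_job(source: &dyn Source, manga_key: &str) -> Result<usize, String> {
    let chapters = source.list_chapters(manga_key)?;
    Ok(chapters.len())
}

fn main() {
    // New sites would be registered here as additional boxed impls.
    let sources: Vec<Box<dyn Source>> = vec![Box::new(ExampleSite)];
    for source in &sources {
        match run_list_job(source.as_ref(), "demo") {
            Ok(count) => println!("{}: {count} chapters", source.name()),
            Err(err) => eprintln!("{}: failed: {err}", source.name()),
        }
    }
}
```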
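
The "new / updated / dropped" semantics mentioned for the `diff` submodule are not spelled out in this commit. One plausible reading, sketched below under that assumption, compares a previous and a current snapshot keyed by an item identifier with a content fingerprint as the value. The `Changes` struct, the `diff` and `snapshot` functions, and the fingerprint idea are all hypothetical; `added` corresponds to "new" in the module docs.

```rust
use std::collections::BTreeMap;

/// Hypothetical result type: keys grouped by what happened to them.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Changes {
    pub added: Vec<String>,   // present now, absent before ("new")
    pub updated: Vec<String>, // present in both, fingerprint differs
    pub dropped: Vec<String>, // present before, absent now
}

/// Compare two snapshots mapping item key -> content fingerprint.
pub fn diff(previous: &BTreeMap<String, String>, current: &BTreeMap<String, String>) -> Changes {
    let mut changes = Changes::default();
    for (key, fingerprint) in current {
        match previous.get(key) {
            None => changes.added.push(key.clone()),
            Some(old) if old != fingerprint => changes.updated.push(key.clone()),
            Some(_) => {} // unchanged
        }
    }
    for key in previous.keys() {
        if !current.contains_key(key) {
            changes.dropped.push(key.clone());
        }
    }
    changes
}

/// Small helper to build a snapshot from string pairs.
pub fn snapshot(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs
        .iter()
        .map(|(key, fingerprint)| (key.to_string(), fingerprint.to_string()))
        .collect()
}

fn main() {
    let previous = snapshot(&[("ch-1", "a"), ("ch-2", "b"), ("ch-3", "c")]);
    let current = snapshot(&[("ch-1", "a"), ("ch-2", "B"), ("ch-4", "d")]);
    println!("{:?}", diff(&previous, &current));
}
```

A `BTreeMap` keeps the output order deterministic, which makes the result easier to assert on in tests.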