feat(crawler): plumb TorController through FetchContext and pipelines
Adds CRAWLER_TOR_CONTROL_URL / _PASSWORD / _COOKIE_PATH / _RECIRCUIT_MAX_ATTEMPTS to CrawlerConfig and to bin/crawler.rs's env reads. Constructs an Option<Arc<TorController>> at daemon / CLI startup and threads it through FetchContext, pipeline::run_metadata_pass, and content::sync_chapter_content as Option<&TorController>.

Pure scaffolding: the controller isn't used yet, so behavior is unchanged. Next commit wires the retry hooks and session-probe recircuit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
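The startup plumbing described in the message can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: the `TorController` struct here is a stand-in with a single made-up field, `build_tor` and `describe` are hypothetical helpers, and treating an unset `CRAWLER_TOR_CONTROL_URL` as "no controller" is assumed rather than confirmed. Only the one fully spelled-out variable name is used.

```rust
use std::sync::Arc;

// Hypothetical stand-in for crate::crawler::tor::TorController; the real
// type's fields and constructor are not shown in this commit.
struct TorController {
    control_url: String,
}

// Build the optional controller from a lookup function, so the logic can be
// exercised without touching the real process environment.
// Assumption: an unset CRAWLER_TOR_CONTROL_URL means "no controller".
fn build_tor(lookup: impl Fn(&str) -> Option<String>) -> Option<Arc<TorController>> {
    lookup("CRAWLER_TOR_CONTROL_URL")
        .map(|control_url| Arc::new(TorController { control_url }))
}

// Downstream code only borrows the controller, mirroring the
// Option<&TorController> parameters named in the commit message.
fn describe(tor: Option<&TorController>) -> String {
    match tor {
        Some(t) => format!("tor control at {}", t.control_url),
        None => "tor disabled".to_string(),
    }
}

fn main() {
    // Owned once at startup as Option<Arc<TorController>>.
    let tor = build_tor(|key| std::env::var(key).ok());
    // Option<Arc<T>>::as_deref yields Option<&T> without cloning the Arc.
    println!("{}", describe(tor.as_deref()));
}
```

Holding the owner as `Option<Arc<_>>` and lending `Option<&_>` keeps the call sites free of reference-count traffic, which is one plausible reason for the shape the message describes.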
@@ -67,6 +67,10 @@ pub struct SourceChapter {
 pub struct FetchContext<'a> {
     pub browser: &'a Browser,
     pub rate: &'a crate::crawler::rate_limit::HostRateLimiters,
+    /// Optional TOR control-port client. When `Some`, retry helpers
+    /// signal `NEWNYM` between transient-page attempts so the next try
+    /// draws a fresh exit. `None` keeps pre-TOR behavior.
+    pub tor: Option<&'a crate::crawler::tor::TorController>,
 }

 /// Lazy iterator over discovered manga refs. The caller drives the
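The doc comment on `tor` describes behavior that this commit does not yet implement. A minimal sketch of what such a retry helper might look like is given here, purely as an illustration: `fetch_with_recircuit`, `signal_newnym`, and the call counter are hypothetical names, the trimmed `FetchContext` omits the real `browser` and `rate` fields, and the actual retry hooks may differ.

```rust
use std::cell::Cell;

// Hypothetical stand-in; the real TorController API is not shown here.
struct TorController {
    newnym_calls: Cell<u32>,
}

impl TorController {
    // Assumed method name. A real client would send SIGNAL NEWNYM over the
    // control port; this stub only counts calls.
    fn signal_newnym(&self) {
        self.newnym_calls.set(self.newnym_calls.get() + 1);
    }
}

// Trimmed-down context carrying only the optional controller.
struct FetchContext<'a> {
    tor: Option<&'a TorController>,
}

// Retry `attempt` up to `max_attempts` times. Between failed tries, request
// a fresh circuit when a controller is present; with `None` this is a plain
// retry loop.
fn fetch_with_recircuit<T>(
    ctx: &FetchContext<'_>,
    max_attempts: u32,
    mut attempt: impl FnMut(u32) -> Option<T>,
) -> Option<T> {
    for n in 1..=max_attempts {
        if let Some(value) = attempt(n) {
            return Some(value);
        }
        if n < max_attempts {
            if let Some(tor) = ctx.tor {
                tor.signal_newnym();
            }
        }
    }
    None
}

fn main() {
    let tor = TorController { newnym_calls: Cell::new(0) };
    let ctx = FetchContext { tor: Some(&tor) };
    // The closure fails twice, then succeeds on the third attempt.
    let page = fetch_with_recircuit(&ctx, 5, |n| if n == 3 { Some("ok") } else { None });
    println!("{:?} after {} recircuits", page, tor.newnym_calls.get());
}
```

Because the field is an `Option`, callers that pass `None` take exactly the same path as before the field existed, which matches the "`None` keeps pre-TOR behavior" note.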