The backend now boots an internal crawler daemon that runs a daily metadata pass (CRAWLER_DAILY_AT in CRAWLER_TZ, advisory-lock guarded for multi-replica safety) and drains SyncChapterContent jobs from crawler_jobs through a worker pool. Chromium launches lazily on first job and is torn down after CRAWLER_IDLE_TIMEOUT_S seconds of inactivity.

Modules:
- crawler::browser_manager: lazy-launch / idle-teardown wrapper around browser::Handle, with an on_launch hook that re-injects PHPSESSID on every fresh Chromium spawn.
- crawler::pipeline: run_metadata_pass (the shared discover/upsert/cover/sync-chapters loop) and the enqueue_bookmarked_pending helper used by the cron tick.
- crawler::daemon: cron task + worker pool, behind two trait seams (MetadataPass, ChapterDispatcher) so tests can inject stubs without standing up Chromium or a live source.

Behavior:
- CRAWLER_DAEMON=false skips daemon spawn entirely (default for tests).
- Catch-up tick fires on startup if the last persisted slot was missed.
- A SyncOutcome::SessionExpired sets a sticky AtomicBool; workers idle until operator restart with a refreshed PHPSESSID.
- Worker dispatch is wrapped in catch_unwind so a panicking handler marks the job failed instead of taking down the worker.
- Migration 0015 adds a small crawler_state k-v table for the last_metadata_tick_at watermark.

Dep additions: chrono-tz (IANA TZ parsing).

CLI (bin/crawler) reuses pipeline::run_metadata_pass and now holds the browser via BrowserManager so the on_launch session injection flow stays in one place.

Inline chapter-content sync semantics are unchanged: the queue is for the daemon; force-refetches and manual backfills still bypass it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
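
The two worker-side behaviors above (the sticky session flag and the catch_unwind wrapper) can be sketched with the standard library alone. This is a minimal synchronous illustration, not the daemon's actual code: `SessionGate`, `JobStatus`, and `dispatch_one` are hypothetical names, the `SyncOutcome` enum here is a two-variant stand-in, and leaving a job `Deferred` when the session has expired is an assumption (the commit only says workers idle). The real workers are presumably async, where a future-aware wrapper would be needed instead of `std::panic::catch_unwind`.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the outcome a chapter handler reports (hypothetical shape).
#[derive(Debug, PartialEq)]
pub enum SyncOutcome {
    Synced,
    SessionExpired,
}

/// Hypothetical result of one dispatch attempt.
#[derive(Debug, PartialEq)]
pub enum JobStatus {
    Done,
    Failed,
    /// Not processed because the session is (or just became) expired.
    Deferred,
}

/// Sticky flag: once set, nothing in the process clears it again.
pub struct SessionGate {
    expired: AtomicBool,
}

impl SessionGate {
    pub const fn new() -> Self {
        Self { expired: AtomicBool::new(false) }
    }

    pub fn is_expired(&self) -> bool {
        self.expired.load(Ordering::Acquire)
    }

    fn mark_expired(&self) {
        self.expired.store(true, Ordering::Release);
    }
}

/// Run one handler, turning a panic into `Failed` instead of unwinding
/// through the worker loop.
pub fn dispatch_one<F>(gate: &SessionGate, handler: F) -> JobStatus
where
    F: FnOnce() -> SyncOutcome,
{
    // Workers idle once the flag is set; the handler is never invoked.
    if gate.is_expired() {
        return JobStatus::Deferred;
    }
    // AssertUnwindSafe: the sketch assumes the handler leaves no shared
    // state half-updated when it panics.
    match panic::catch_unwind(AssertUnwindSafe(handler)) {
        Ok(SyncOutcome::Synced) => JobStatus::Done,
        Ok(SyncOutcome::SessionExpired) => {
            gate.mark_expired();
            JobStatus::Deferred
        }
        Err(_) => JobStatus::Failed,
    }
}
```

A worker loop would call `dispatch_one` once per claimed job and persist the returned status. Because the gate has no reset method, only a process restart clears it, which matches the "until operator restart" wording.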
33 lines
1.1 KiB
Rust
use std::net::SocketAddr;
use tracing_subscriber::EnvFilter;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    dotenvy::dotenv().ok();
    tracing_subscriber::fmt()
        .with_env_filter(
            EnvFilter::try_from_default_env().unwrap_or_else(|_| "info,mangalord=debug".into()),
        )
        .init();

    let config = mangalord::config::Config::from_env()?;
    let addr: SocketAddr = config.bind_address.parse()?;
    let mangalord::app::AppHandle { router, daemon } = mangalord::app::build(config).await?;

    tracing::info!(%addr, "mangalord listening");
    let listener = tokio::net::TcpListener::bind(addr).await?;
    axum::serve(listener, router)
        .with_graceful_shutdown(async {
            let _ = tokio::signal::ctrl_c().await;
            tracing::info!("ctrl-c received; shutting down");
        })
        .await?;

    // Drain background tasks (crawler daemon) before exiting so Chromium
    // gets a clean shutdown rather than relying on kill-on-drop.
    if let Some(d) = daemon {
        d.shutdown().await;
    }
    Ok(())
}
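
The drain-before-exit step at the end of `main` can be illustrated with a small std-thread analogue. This is only a sketch under stated assumptions: the real `daemon` handle is async and its internals are not shown in this file, so `DaemonHandle`, `spawn_daemon`, and the `cleanup` callback below are hypothetical, and an OS thread is swapped in for the tokio tasks. The point it shows is that `shutdown` both signals the loop and waits for its cleanup to finish, rather than leaving teardown to whatever happens on drop.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the `daemon` half of `AppHandle`.
pub struct DaemonHandle {
    stop: Arc<AtomicBool>,
    worker: JoinHandle<()>,
}

/// Spawn a background loop. `cleanup` runs on the worker thread after the
/// stop flag is observed (for example, closing a browser process).
pub fn spawn_daemon<C>(cleanup: C) -> DaemonHandle
where
    C: FnOnce() + Send + 'static,
{
    let stop = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&stop);
    let worker = thread::spawn(move || {
        while !flag.load(Ordering::Acquire) {
            // A real worker would claim and process a job here.
            thread::sleep(Duration::from_millis(5));
        }
        cleanup();
    });
    DaemonHandle { stop, worker }
}

impl DaemonHandle {
    /// Signal the loop to stop, then block until cleanup has finished.
    pub fn shutdown(self) {
        self.stop.store(true, Ordering::Release);
        self.worker.join().expect("daemon thread panicked");
    }
}
```

Taking `self` by value in `shutdown` means the handle cannot be reused after the drain, which mirrors the `if let Some(d) = daemon { d.shutdown().await; }` shape in `main`.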