fix(crawler): scope dispatch_target to live sources, newest first (0.36.4)

The chapter dispatcher's URL resolver had no dropped_at filter and no ORDER BY, so a chapter whose only chapter_sources row had been soft-dropped was still dispatched against the stale URL, eating retry budget on guaranteed transient failures. With multiple live sources the LIMIT 1 winner was nondeterministic.

Add `AND cs.dropped_at IS NULL` and `ORDER BY cs.last_seen_at DESC` to dispatch_target, bringing it in lockstep with the enqueue queries in pipeline.rs that already filter on dropped_at. It now returns None when all sources are dropped; callers in daemon.rs already treat None as "ack the job, skip the work."

Tests in tests/repo_chapter.rs cover the three branches (freshest live wins, dropped sources skipped, all-dropped returns None).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
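
The "ack the job, skip the work" contract mentioned in the message can be sketched as a small decision function. This is only an illustration, not the code in daemon.rs: `JobOutcome` and `decide` are hypothetical names, and the sketch carries only the URL, whereas the real resolver also returns the manga_id.

```rust
/// Hypothetical outcome of handling one leased chapter job.
#[derive(Debug, PartialEq, Eq)]
pub enum JobOutcome {
    /// A live URL was resolved; the content sync should run against it.
    Sync { source_url: String },
    /// No live source remains; acknowledge the job without doing any work.
    AckAndSkip,
}

/// Maps the resolver's result onto what the dispatcher does next.
/// `target` stands in for the URL part of what `dispatch_target` returns.
pub fn decide(target: Option<String>) -> JobOutcome {
    match target {
        Some(source_url) => JobOutcome::Sync { source_url },
        // All sources dropped (or chapter gone): do not spend retry budget.
        None => JobOutcome::AckAndSkip,
    }
}
```

Treating `None` as a successful no-op rather than an error is what keeps a fully dropped chapter from being retried.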
@@ -118,10 +118,21 @@ pub async fn page_count(pool: &PgPool, id: Uuid) -> sqlx::Result<Option<i32>> {
     .await
 }

-/// Look up the manga_id + most recent source_url for a chapter. Used
-/// by the daemon's chapter dispatcher to resolve the URL it needs to
-/// hand to `content::sync_chapter_content`. Returns `None` if the
-/// chapter (or its source row) is gone.
+/// Look up the manga_id + most recent live source_url for a chapter.
+/// Used by the daemon's chapter dispatcher to resolve the URL it needs
+/// to hand to `content::sync_chapter_content`.
+///
+/// Skips soft-dropped sources (`cs.dropped_at IS NOT NULL`) and breaks
+/// ties between multiple live sources by `last_seen_at DESC`, so the
+/// freshest still-attached URL wins. Returns `None` when the chapter
+/// is gone or all its source rows are dropped — callers in the
+/// dispatcher treat `None` as "ack the job, skip the work."
+///
+/// The enqueue queries (`pipeline::enqueue_bookmarked_pending` and
+/// `enqueue_pending_for_manga`) apply the same `dropped_at IS NULL`
+/// filter — this resolver stays in lockstep so a chapter that was
+/// dropped between enqueue and lease isn't dispatched against a stale
+/// URL.
 pub async fn dispatch_target(
     pool: &PgPool,
     chapter_id: Uuid,
@@ -131,6 +142,8 @@ pub async fn dispatch_target(
          FROM chapters c \
          JOIN chapter_sources cs ON cs.chapter_id = c.id \
          WHERE c.id = $1 \
+           AND cs.dropped_at IS NULL \
+         ORDER BY cs.last_seen_at DESC \
          LIMIT 1",
     )
     .bind(chapter_id)
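
The selection rule that the two added SQL lines introduce can be modelled in memory, which makes the three tested branches easy to see. This is a minimal sketch under stated assumptions, not the repository's code: `SourceRow` and `pick_live_url` are hypothetical names, timestamps are simplified to integers, and `last_seen_at` is assumed to be non-null.

```rust
/// Hypothetical in-memory stand-in for one `chapter_sources` row.
#[derive(Debug, Clone)]
pub struct SourceRow {
    pub source_url: String,
    /// Simplified timestamp standing in for `last_seen_at`.
    pub last_seen_at: i64,
    /// `None` means the source is live; `Some(_)` means soft-dropped.
    pub dropped_at: Option<i64>,
}

/// Mirrors `WHERE dropped_at IS NULL ORDER BY last_seen_at DESC LIMIT 1`:
/// drop the soft-dropped rows, then take the most recently seen one.
pub fn pick_live_url(rows: &[SourceRow]) -> Option<&str> {
    rows.iter()
        .filter(|r| r.dropped_at.is_none())
        .max_by_key(|r| r.last_seen_at)
        .map(|r| r.source_url.as_str())
}
```

Without the filter a dropped row could be returned, and without the ordering any live row could be, which matches the two failure modes described in the commit message.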