fix(crawler): wait for page marker instead of fixed 1s sleep (0.36.2)
A chromium snapshot taken between the wrapper-render and row-render phases let parse_chapter_list return Ok(vec![]) for a manga that actually has chapters; the soft-drop branch in sync_manga_chapters then flipped every existing chapter to dropped_at.

Add wait_for_selector to crawler::nav. navigate() now takes a CSS marker matching the most-specific element the downstream parser will look for (one of LIST_PAGE_MARKER / DETAIL_PAGE_CHAPTERS_MARKER / DETAIL_PAGE_LAYOUT_MARKER). The wait is best-effort and capped by SELECTOR_TIMEOUT (10s); a legitimately empty page can still pass through because the parser's #chapter_table sentinel and the universal broken-page body check stay in force.

Same pattern wired at the reader nav (a#pic_container) and probe nav (#logo), replacing the implicit assumption that the post-load JS had finished within 1 second.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
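For illustration only, the best-effort, capped wait described in the message can be sketched as a polling loop with a deadline. This is not the code from crawler::nav: the real wait_for_selector is async and works on a browser page, while the sketch below is synchronous and std-only so it can run on its own. The names `wait_for_marker`, `WaitOutcome`, and `poll_every` are hypothetical, and the `marker_present` closure stands in for "does the CSS marker exist in the page yet?".

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Cap on the wait, mirroring the 10s SELECTOR_TIMEOUT named in the commit
/// message (redefined here so the sketch is self-contained).
pub const SELECTOR_TIMEOUT: Duration = Duration::from_secs(10);

/// Outcome of a best-effort wait. Neither variant is an error: on `TimedOut`
/// the caller still reads the page and lets the parser's own checks decide.
#[derive(Debug, PartialEq, Eq)]
pub enum WaitOutcome {
    Found,
    TimedOut,
}

/// Polls `marker_present` until it returns true or `timeout` elapses.
/// The probe always runs at least once, even with a zero timeout.
pub fn wait_for_marker<F>(
    mut marker_present: F,
    timeout: Duration,
    poll_every: Duration,
) -> WaitOutcome
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if marker_present() {
            return WaitOutcome::Found;
        }
        let now = Instant::now();
        if now >= deadline {
            return WaitOutcome::TimedOut;
        }
        // Sleep for one poll interval, but never past the deadline.
        sleep(poll_every.min(deadline - now));
    }
}
```

Under these assumptions, a caller that wants the same "timeout is fine" behavior as the probe nav can discard the result (`let _ = wait_for_marker(...)`) and go on to read the page. Unlike the fixed 1s sleep, a fast page returns as soon as the marker shows up, and a slow page gets up to the full cap.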
@@ -206,6 +206,15 @@ async fn fetch_probe_html(browser: &Browser, probe_url: &str) -> anyhow::Result<
     crate::crawler::nav::wait_for_nav(&page)
         .await
         .context("wait for nav on probe")?;
+    // Best-effort wait for the layout marker. Timeout is fine — the
+    // probe classifier handles a missing `#logo` as Transient anyway,
+    // and the verify loop retries on Transient.
+    let _ = crate::crawler::nav::wait_for_selector(
+        &page,
+        "#logo",
+        crate::crawler::nav::SELECTOR_TIMEOUT,
+    )
+    .await;
     let html = page.content().await.context("read probe html")?;
     page.close().await.ok();
     Ok(html)