chore: dedupe is_unique_violation, lift SQL into repo, centralise URL parsing

Layering cleanups from REVIEW.md §5 / §3:
- Drop the three private `is_unique_violation` helpers in
  repo::{user,chapter,bookmark} in favour of sqlx 0.8's
  `DatabaseError::is_unique_violation()` method (already used by
  repo::collection).
- Remove the unreachable 23505 branch in repo::chapter::create: the
  (manga_id, number) UNIQUE was dropped in 0013, so the defensive arm
  could no longer fire. A doc note records what to do if uniqueness
  is re-added.
- Move three inline SQL queries out of handlers/daemon into repo
  functions: bookmarks' chapter-belongs-to-manga guard
  (`repo::chapter::belongs_to_manga`), the daemon's dispatch lookup
  (`repo::chapter::dispatch_target`), and the daemon's page_count
  safety net (`repo::chapter::page_count`). Restores the
  handlers→repo layering invariant in CLAUDE.md.
- New `crawler::url_utils` module consolidates host_of / origin_of /
  registrable_domain; they used to live in three crawler submodules
  with diverging edge-case behaviour. Tests moved with them.
- Doc cross-references on repo::author::set_for_manga and
  repo::genre::set_for_manga pointing to the crawler's name-keyed
  variants, so the intentional duplication is discoverable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -98,15 +98,9 @@ impl HostRateLimiters {
     }
 }
 
-/// Extract the host (no port) from a URL string. Returns `None` for
-/// inputs without a `scheme://host` shape — those would never have
-/// reached the network layer anyway.
-fn host_of(url: &str) -> Option<String> {
-    let after_scheme = url.split_once("://")?.1;
-    let host_with_port = after_scheme.split('/').next()?;
-    let host = host_with_port.rsplit_once(':').map_or(host_with_port, |(h, _)| h);
-    (!host.is_empty()).then(|| host.to_ascii_lowercase())
-}
+// `host_of` was duplicated across session/rate_limit/pipeline; the
+// canonical version now lives in `crawler::url_utils`.
+use crate::crawler::url_utils::host_of;
 
 #[cfg(test)]
 mod tests {
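For orientation, here is a minimal std-only sketch of what a consolidated `crawler::url_utils` could look like. `host_of` repeats the logic of the helper removed in the hunk above. `origin_of` is hypothetical: the commit names the function but does not show its body, so its `scheme://host[:port]` behaviour here is an assumption. `registrable_domain` is left out because the commit does not show how it is computed.

```rust
// Sketch only: the real `crawler::url_utils` is not shown in this commit.

/// Extract the lowercased host (no port) from a `scheme://host[:port]/...`
/// string. Same logic as the `host_of` removed in the hunk above.
pub fn host_of(url: &str) -> Option<String> {
    let after_scheme = url.split_once("://")?.1;
    let host_with_port = after_scheme.split('/').next()?;
    let host = host_with_port
        .rsplit_once(':')
        .map_or(host_with_port, |(h, _)| h);
    (!host.is_empty()).then(|| host.to_ascii_lowercase())
}

/// ASSUMPTION: returns `scheme://host[:port]`, lowercased, with the path
/// dropped. The actual `origin_of` may differ.
pub fn origin_of(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    let authority = rest.split('/').next()?;
    (!scheme.is_empty() && !authority.is_empty()).then(|| {
        format!(
            "{}://{}",
            scheme.to_ascii_lowercase(),
            authority.to_ascii_lowercase()
        )
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn host_strips_port_and_lowercases() {
        assert_eq!(
            host_of("https://Example.com:8080/a/b").as_deref(),
            Some("example.com")
        );
        assert_eq!(host_of("no-scheme/path"), None);
    }

    #[test]
    fn origin_keeps_port_and_drops_path() {
        assert_eq!(
            origin_of("HTTPS://Example.com:8080/a").as_deref(),
            Some("https://example.com:8080")
        );
    }
}
```

Keeping one copy means edge cases are decided once. For example, this string-splitting approach does not handle a bracketed IPv6 literal without a port (`http://[::1]/` yields `[:`), nor a query string that follows the host with no `/` in between. With a single module, a fix for either lands in one place.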