Mangalord

Author	SHA1	Message	Date
MechaCat02	33f7e19077	fix(crawler): serialize sync_manga_chapters per-manga (0.35.6) Two concurrent calls of sync_manga_chapters for the same manga both read seen_keys, both run the drop UPDATE filtered on `NOT (key = ANY $3)`, and the later commit can soft-drop a chapter the earlier had just inserted (lost-update under MVCC). Today the cron tick is the only caller and the daemon-level advisory lock keeps it single-flight, but that lock is held on one pool connection and doesn't actually serialize the function: any future caller (bookmark hook, admin-triggered re-sync, parallel worker) would race against the cron. Add `pg_advisory_xact_lock(hashtextextended(manga_id::text, 0))` at the start of the transaction. Auto-releases on commit/rollback so a panic mid-call can't strand the lock. Lock keyed per-manga so calls for different mangas still parallelize. Test sync_chapters_serializes_concurrent_calls_for_same_manga spawns two tokio tasks calling the function concurrently with overlapping chapter lists and asserts every chapter survives. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 20:45:01 +02:00
MechaCat02	c6bb9160e3	fix(crawler): scope chapter_sources lookup per-manga (0.35.5) chapter_sources's PRIMARY KEY was (source_id, source_chapter_key) and the lookup in sync_manga_chapters didn't constrain by manga_id, so a source whose chapter slugs aren't globally unique (e.g. "chapter-1" appearing under multiple mangas) silently attributed every collision to the first manga that synced it. The INSERT path would have conflicted on the second manga's sync. Migration 0017 drops the old PK and rekeys on (source_id, chapter_id) — the natural identity of a per-source chapter attachment — and adds an index on (source_id, source_chapter_key) for the lookup path. The repo lookup now joins chapters and filters by manga_id; the UPDATE path keys on chapter_id directly (the row's natural identifier post-migration). Test sync_chapters_isolates_colliding_keys_across_mangas pins the contract end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 20:43:08 +02:00
MechaCat02	8d34132883	bugfix: security & correctness bundle (0.34.1) Five fixes bundled into one release: - preserve user-attached tags across crawler upserts (repo::crawler::sync_tags now scopes to added_by IS NULL; orphaned attachments from deleted users are reaped as crawler-owned) - gate manga PATCH and cover endpoints on uploaded_by (require_can_edit in api::mangas; non-NULL uploaded_by must match the caller) - equalise login response time across user-existence branches (run argon2 against a OnceLock-cached dummy hash on the no-user branch so timing doesn't leak username existence) - crawler download defences (SSRF allowlist of host literals including IPv4-mapped IPv6 ranges, 32 MiB streamed size cap, reject non-whitelisted image types, three-way chapter-probe classifier replaces the binary #avatar_menu check) - tighten validation and clean up dead unload path (attach_tag + create_token enforce 64-char caps; LocalStorage rejects NUL bytes explicitly; reader flushFinalProgress drops the always-405 sendBeacon path) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-28 20:24:51 +02:00
MechaCat02	51346227dd	feat: route reader by chapter id, allow duplicate-numbered chapters (0.24.0) Real-world sources publish multiple chapters at the same number: different scanlators ("Ch.52 from bloomingdale" + "Ch.52 from mina"), translator notices and farewells, alt-translations. The (manga_id, number) UNIQUE constraint from 0001 silently collapsed all of those into a single row via the upsert path in repo::crawler. Migration 0013 drops the constraint; sync_manga_chapters now plain-INSERTs each SourceChapterRef so every parsed chapter survives as its own row. Identity moves from the (manga_id, number) tuple to the chapter UUID: - `GET /api/v1/mangas/:manga_id/chapters/:chapter_id` (replaces :number) - `GET /api/v1/mangas/:manga_id/chapters/:chapter_id/pages` - `repo::chapter::find_by_id_in_manga` (replaces find_by_manga_and_number) - Frontend reader route renamed to `/manga/[id]/chapter/[chapter_id]` - Chapter links throughout (manga page list, continue-reading CTA, reader prev/next, history rows, bookmark cards) use chapter.id - API clients getChapter/getChapterPages take a chapter id string. read_progress + bookmarks already FK chapter_id; they only enrich with chapter_number for display, which is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 23:37:07 +02:00
MechaCat02	b1a3a4e9d3	feat: crawler manga-list & metadata sync with cover download (0.23.0) - TargetSource: first concrete impl of the Source trait, modeled on the old Puppeteer crawler's selectors (+ status normalization, tag-count stripping, chapter list) - DiscoverMode::Backfill walks pagination last->1, reverse within each page (oldest-first); Incremental walks forward - RateLimiter (tokio-time aware) plumbed through FetchContext so the pagination walk honors the same per-host budget as the outer loop - repo::crawler: ensure_source, upsert_manga_from_source (returns New/Updated/Unchanged + current cover_image_path for backfill decisions), sync_manga_chapters, mark_dropped_mangas — all transactional, with case-insensitive lookups and source-insertable genres - Cover image download via reqwest+infer; stored under mangas/{id}/cover.{ext} via the Storage trait - Single CRAWLER_PROXY env wires both Chromium (--proxy-server) and reqwest::Proxy::all (HTTP/HTTPS/SOCKS5) - Crawler binary: positional start URL or $CRAWLER_START_URL, $CRAWLER_LIMIT (cap fetches + skip drop pass on partial runs), $CRAWLER_SKIP_CHAPTERS (disable selector AND sync), $CRAWLER_RATE_MS - Silences chromiumoxide 0.7's known CDP deserialize log spam via default tracing filter + CdpError::Serde downgrade - 9 sqlx integration tests + 11 selector/rate-limit unit tests	2026-05-21 22:04:23 +02:00
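
The 0.23.0 entry describes two discovery orders: Backfill walks pagination from the last page to page 1 and reverses each page, while Incremental walks forward. A minimal sketch of that ordering follows, assuming the source lists newest entries first. The `visit_order` helper and this `DiscoverMode` definition are hypothetical stand-ins, not the crawler's actual types.

```rust
/// Hypothetical stand-in for the crawler's discovery mode.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DiscoverMode {
    Backfill,
    Incremental,
}

/// Returns listing entries in the order they would be visited.
/// `pages[0]` is page 1; each inner Vec is one page, top to bottom.
fn visit_order<T: Clone>(pages: &[Vec<T>], mode: DiscoverMode) -> Vec<T> {
    match mode {
        // Forward walk: page 1 first, entries in on-page order.
        DiscoverMode::Incremental => pages.iter().flatten().cloned().collect(),
        // Last page first, and each page bottom-to-top, so a
        // newest-first listing is visited oldest-first.
        DiscoverMode::Backfill => pages
            .iter()
            .rev()
            .flat_map(|page| page.iter().rev())
            .cloned()
            .collect(),
    }
}

fn main() {
    // Assumed newest-first listing: "e" is the newest entry, "a" the oldest.
    let pages = vec![vec!["e", "d"], vec!["c", "b"], vec!["a"]];
    println!("{:?}", visit_order(&pages, DiscoverMode::Backfill));
    println!("{:?}", visit_order(&pages, DiscoverMode::Incremental));
}
```

Under that newest-first assumption, Backfill inserts the oldest entries first, so insertion order tracks publication order.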

5 Commits
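
The 0.35.6 entry serializes sync_manga_chapters per manga with a transaction-scoped advisory lock. The sketch below is an in-process analogue of that idea using only the standard library; it is not the repository's code and it does not talk to Postgres. `KeyedLocks`, `Store`, `sync_chapters`, and `run_demo` are hypothetical names. A per-key mutex plays the role of the per-manga lock, and the guard going out of scope plays the role of release on commit or rollback. One difference: a `std::sync::Mutex` is poisoned if its holder panics, whereas the commit notes that the advisory lock is simply released.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical registry handing out one mutex per key (here: per manga id).
#[derive(Default)]
struct KeyedLocks {
    inner: Mutex<HashMap<u64, Arc<Mutex<()>>>>,
}

impl KeyedLocks {
    /// Returns the mutex for `key`, creating it on first use.
    fn lock_for(&self, key: u64) -> Arc<Mutex<()>> {
        let mut map = self.inner.lock().unwrap();
        map.entry(key).or_default().clone()
    }
}

/// Stand-in for the chapters table: manga id -> set of chapter keys.
type Store = Mutex<HashMap<u64, BTreeSet<String>>>;

/// Read-modify-write of one manga's chapter set. Without the per-key guard,
/// two callers could both read the old set and the later write would discard
/// the earlier caller's additions (a lost update).
fn sync_chapters(locks: &KeyedLocks, store: &Store, manga_id: u64, incoming: &[&str]) {
    let lock = locks.lock_for(manga_id);
    let _guard = lock.lock().unwrap(); // held until this function returns

    // Read phase: snapshot the current set, then release the store lock.
    let mut seen = store
        .lock()
        .unwrap()
        .get(&manga_id)
        .cloned()
        .unwrap_or_default();
    thread::yield_now(); // widen the window in which a race would occur
    seen.extend(incoming.iter().map(|s| s.to_string()));

    // Write phase: replace the stored set with the merged snapshot.
    store.lock().unwrap().insert(manga_id, seen);
}

/// Two concurrent syncs for the same manga with overlapping chapter lists.
fn run_demo() -> BTreeSet<String> {
    let locks = KeyedLocks::default();
    let store: Store = Mutex::new(HashMap::new());
    thread::scope(|s| {
        s.spawn(|| sync_chapters(&locks, &store, 1, &["ch-1", "ch-2"]));
        s.spawn(|| sync_chapters(&locks, &store, 1, &["ch-2", "ch-3"]));
    });
    store.into_inner().unwrap().remove(&1).unwrap_or_default()
}

fn main() {
    println!("{:?}", run_demo());
}
```

Because the lock is keyed by manga id, calls for different mangas take different mutexes and still run in parallel, matching the per-manga keying the commit describes.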