A hung TLS handshake or a page that never fires load could wedge a
worker (or the cron metadata pass) indefinitely — chromiumoxide
imposes no navigation timeout of its own.
New crawler::nav::wait_for_nav caps each navigation at NAV_TIMEOUT
(30s) and returns a typed NavError so timeouts surface as transient
(retryable) errors. Wired at the three navigation sites:
- source::target::navigate (catalog/detail/pagination)
- content::sync_chapter_content (chapter reader)
- session::fetch_probe_html (session probe)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapses the crawler to a single newest-first walker and replaces the
N-consecutive-unchanged streak with a per-manga rule: stop on the first
manga where metadata is Unchanged AND chapter sync reports zero new
chapters. The early stop is gated by a per-source recovery flag stored
in `crawler_state` — set to `false` when a run starts, back to `true`
only on a clean exit (end-of-walk or intentional stop). A crashed run
leaves the flag `false` automatically (no shutdown code runs), so the
next tick walks the full catalog instead of bailing at the first
caught-up manga.
A crashed mid-walk run therefore self-heals: the recovery walk visits
every page (recovering anything the crash missed past its crash
point), and steady state resumes once that sweep reaches end-of-walk.
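A minimal sketch of the stop rule described above, assuming the recovery flag is read once before the run starts. The function name `should_stop_walk` and the parameter `previous_run_clean` are hypothetical; only the `Unchanged` outcome and the "zero new chapters" condition come from this message.

```rust
/// Outcome of a metadata upsert (variant names as used elsewhere in this log).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpsertOutcome {
    New,
    Updated,
    Unchanged,
}

/// Returns true when the newest-first walk may stop at this manga.
///
/// `previous_run_clean` is the recovery flag as it stood before this run
/// reset it to `false`. If the previous run crashed, the early stop is
/// disabled and the walk continues through the full catalog.
pub fn should_stop_walk(
    previous_run_clean: bool,
    outcome: UpsertOutcome,
    new_chapters: usize,
) -> bool {
    previous_run_clean && outcome == UpsertOutcome::Unchanged && new_chapters == 0
}
```

Both conditions must hold on the same manga; an `Unchanged` manga that still gained chapters does not end the walk.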
Removed:
- DiscoverMode enum, Backfill mode, the boundary re-check +
displaced-refs machinery in TargetSourceWalker.
- Drop-pass (mark_dropped_mangas) and seed-completion plumbing
(mark_seed_completed / seed_completed_at). The recovery flag
subsumes the seed-completion signal; drop detection was explicitly
opted out.
- JobPayload::Discover (no production callers).
- CRAWLER_MODE / CRAWLER_INCREMENTAL_STOP_AFTER env vars and the
CrawlerModePref config type.
`should_mark_clean_exit(walked_to_completion, hit_stop_condition)`
encodes the clean-exit truth table in its signature — `hit_limit` is
deliberately absent so a future edit cannot accidentally count a
caller-imposed cap as a clean exit.
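For illustration, the truth table could look like the sketch below. The body is an assumption derived from "end-of-walk or intentional stop"; only the name and the two parameters are taken from this message.

```rust
/// A run counts as clean when it either walked the whole catalog or
/// stopped on purpose at the first caught-up manga.
///
/// | walked_to_completion | hit_stop_condition | clean exit |
/// |----------------------|--------------------|------------|
/// | false                | false              | false      |
/// | true                 | false              | true       |
/// | false                | true               | true       |
/// | true                 | true               | true       |
///
/// A caller-imposed limit is not a parameter, so a capped run cannot be
/// reported as clean through this function.
pub fn should_mark_clean_exit(walked_to_completion: bool, hit_stop_condition: bool) -> bool {
    walked_to_completion || hit_stop_condition
}
```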
Net -501 lines, 261 backend tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two concurrent calls of sync_manga_chapters for the same manga both
read seen_keys, both run the drop UPDATE filtered on `NOT (key = ANY
$3)`, and the later commit can soft-drop a chapter the earlier had
just inserted (lost-update under MVCC). Today the cron tick is the
only caller and the daemon-level advisory lock keeps it single-flight,
but that lock is held on one pool connection and doesn't actually
serialize the *function*: any future caller (bookmark hook,
admin-triggered re-sync, parallel worker) would race against the cron.
Add `pg_advisory_xact_lock(hashtextextended(manga_id::text, 0))` at
the start of the transaction. Auto-releases on commit/rollback so a
panic mid-call can't strand the lock. Lock keyed per-manga so calls
for different mangas still parallelize.
Test sync_chapters_serializes_concurrent_calls_for_same_manga spawns
two tokio tasks calling the function concurrently with overlapping
chapter lists and asserts every chapter survives.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chapter_sources's PRIMARY KEY was (source_id, source_chapter_key) and
the lookup in sync_manga_chapters didn't constrain by manga_id, so a
source whose chapter slugs aren't globally unique (e.g. "chapter-1"
appearing under multiple mangas) silently attributed every collision
to the first manga that synced it. The INSERT path would have
conflicted on the second manga's sync.
Migration 0017 drops the old PK and rekeys on (source_id, chapter_id)
— the natural identity of a per-source chapter attachment — and adds
an index on (source_id, source_chapter_key) for the lookup path. The
repo lookup now joins chapters and filters by manga_id; the UPDATE
path keys on chapter_id directly (the row's natural identifier
post-migration).
Test sync_chapters_isolates_colliding_keys_across_mangas pins the
contract end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The partial dedup index only blocks (pending|running) duplicates, so
once a SyncChapterContent job transitions to 'dead' (max_attempts
exhausted) the slot frees. Every subsequent cron tick re-enqueued the
chapter — page_count = 0 and dropped_at IS NULL stay true — burned
another max_attempts retries, and died again. Permanent-failure
chapters spun forever.
enqueue_bookmarked_pending and enqueue_pending_for_manga now skip
chapters whose latest sync_chapter_content job is dead within
CHAPTER_DEAD_QUARANTINE_DAYS (7). A failed chapter goes silent for a
week, then gets one more shot — long enough for a transient site
issue to resolve, short enough that permanent failures don't stay
permanent if conditions change.
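The quarantine check can be modelled in memory as below. This is a sketch only: the real filter presumably lives in the enqueue SQL, and `is_quarantined` and its parameters are hypothetical names.

```rust
use std::time::{Duration, SystemTime};

pub const CHAPTER_DEAD_QUARANTINE_DAYS: u64 = 7;

/// `latest_dead_at` is the time the chapter's most recent job went dead,
/// or `None` if its latest job is not dead.
pub fn is_quarantined(latest_dead_at: Option<SystemTime>, now: SystemTime) -> bool {
    let window = Duration::from_secs(CHAPTER_DEAD_QUARANTINE_DAYS * 24 * 60 * 60);
    match latest_dead_at {
        None => false,
        Some(died) => match now.duration_since(died) {
            // Inside the window: skip the chapter on this tick.
            Ok(age) => age < window,
            // `died` is later than `now` (clock skew): stay quarantined.
            Err(_) => true,
        },
    }
}
```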
Two integration tests pin both halves of the contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The three lease-ack functions matched their UPDATE on the job id
alone. If a lease expired and another worker re-leased the row, a
late ack from the original worker would clobber the new lease's
state, leased_until, and (for release) decrement its attempts.
Add `AND state = 'running'` to each UPDATE and log a warn when
rows_affected is zero, so a stolen lease shows up in telemetry without
blocking the new lease holder's progress.
Three new integration tests pin the contract:
- ack_done_no_ops_when_lease_was_stolen
- ack_failed_no_ops_when_state_is_not_running
- release_no_ops_when_state_is_not_running
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parse_chapter_list previously returned Vec::new() on any selector
miss. The empty list flowed into sync_manga_chapters, whose soft-drop
branch then flipped every existing chapter's dropped_at to NOW().
Bookmarks subsequently pointed at dropped sources, and
enqueue_bookmarked_pending (filters on cs.dropped_at IS NULL) silently
stopped re-fetching pages.
Same shape as the walker race fixed in 0.35.1: a transient parse miss
masquerading as "source removed everything" → false soft-drop.
Fix: require #chapter_table in the DOM. Present-but-empty is preserved
as Ok(vec![]) so a freshly added manga with no published chapters
still parses cleanly. Absent table is now Transient — the job system
reschedules with backoff instead of treating the partial render as
data.
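The distinction between "table absent" and "table present but empty" can be sketched without a DOM library. Here the selector lookup is assumed to have already happened and is represented as an `Option`; `chapters_from_table` is a hypothetical helper, and `PageError::Transient` carrying a message is an assumption about its shape.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PageError {
    Transient(String),
}

/// `table_rows` is `None` when `#chapter_table` is missing from the DOM and
/// `Some(rows)` when it is present, possibly with zero rows.
pub fn chapters_from_table(table_rows: Option<Vec<String>>) -> Result<Vec<String>, PageError> {
    match table_rows {
        // Partial render or selector miss: let the job system retry.
        None => Err(PageError::Transient("#chapter_table missing".to_string())),
        // Present table: an empty list is real data, not an error.
        Some(rows) => Ok(rows),
    }
}
```

Only the `Ok` branch should ever reach the soft-drop logic in sync_manga_chapters.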
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The target site orders by update_date DESC, and any new or updated
manga pushes everyone down by one slot. The paginated walker was
blind to this drift:
* Backfill (page last -> 1): shifts push items into pages already
finished. The displaced manga was silently missed; with
mark_dropped_mangas running on a fully-completed walk, items even
got false-dropped because last_seen_at was stale.
* Incremental (page 1 -> last): a shift causes the slot-last item
of an already-read page to reappear on the next page, leading to
a redundant fetch_manga and an inflated consecutive_unchanged
streak.
Fix is two-pronged:
1. Backfill boundary re-check. After fetching each page P, re-fetch
the previously-walked page P+1 and check where its old slot-0
key now sits. If it slid to slot K, the first K entries are
items that used to live on P and slid past us; they get appended
to the batch. If the anchor is gone entirely (multi-page shift
or it was bumped to page 1), the whole re-fetched page is
processed conservatively and the pipeline dedup absorbs the
noise. The re-check must be the *last* navigation of the
iteration to close the within-iteration race.
2. Run-scoped dedup in run_metadata_pass. A HashSet<String> of
source_manga_keys avoids double-processing. The set uses a
contains-then-insert pattern with insert firing *after* a
successful upsert, so a transient fetch/upsert failure leaves
the key retryable if it reappears later in the same pass (via
the boundary re-check or another batch).
Incremental mode does not run the re-check (shifts move in the
same direction as the walk); only the dedup helps it.
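Both prongs can be sketched as pure functions over manga keys. The names `displaced_entries` and `process_batch` are hypothetical, and the upsert is passed in as a closure so the sketch stays free of database code.

```rust
use std::collections::HashSet;

/// Boundary re-check: given the re-fetched page P+1 and the key that used to
/// sit in its slot 0, return the entries that slid in from page P.
/// If the anchor is gone, the whole page is returned (conservative path).
pub fn displaced_entries(refetched: &[String], anchor: &str) -> Vec<String> {
    match refetched.iter().position(|k| k == anchor) {
        Some(k) => refetched[..k].to_vec(),
        None => refetched.to_vec(),
    }
}

/// Run-scoped dedup: contains-then-insert, with the insert only after a
/// successful upsert so a failed key stays retryable later in the pass.
/// Returns how many keys were upserted successfully.
pub fn process_batch<F>(batch: &[String], seen: &mut HashSet<String>, mut upsert: F) -> usize
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut processed = 0;
    for key in batch {
        if seen.contains(key) {
            continue; // already handled in this pass
        }
        if upsert(key).is_ok() {
            seen.insert(key.clone());
            processed += 1;
        }
    }
    processed
}
```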
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three features bundled into one release:
- rate-limit /auth/login, /register, /me/password (token bucket,
5 req/sec sustained with 10-request burst by default; 429 +
Retry-After header on hit; tracing::warn! per hit so operators
see attack patterns; AUTH_RATE_PER_SEC / AUTH_RATE_BURST env knobs)
- handle SIGTERM for graceful container stops (replaces bare
ctrl_c() with a select over ctrl_c + SignalKind::terminate() so
docker compose stop runs the daemon shutdown path instead of
letting Chromium leak past SIGKILL)
- clear session.user on 401 from any API call (setOn401Hook in
api/client.ts, registered from session.svelte.ts gated on
$app/environment::browser so the SSR bundle never installs it;
fixes "logged in but no bookmarks/collections" mid-session
expiry state)
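The token-bucket behaviour in the first bullet (sustained rate plus burst) can be sketched as follows. This is not the limiter used by the backend; `TokenBucket` and its methods are hypothetical, and time is passed in explicitly so the logic is testable without a clock.

```rust
use std::time::Duration;

pub struct TokenBucket {
    rate_per_sec: f64,
    burst: f64,
    tokens: f64,
    last: Duration, // time of the last refill, measured from an arbitrary origin
}

impl TokenBucket {
    /// Starts full, so a client can spend the whole burst immediately.
    pub fn new(rate_per_sec: f64, burst: f64, now: Duration) -> Self {
        TokenBucket { rate_per_sec, burst, tokens: burst, last: now }
    }

    /// Ok(()) admits the request; Err(wait) is how long until one token is
    /// available, which a handler could round up into a Retry-After value.
    pub fn try_acquire(&mut self, now: Duration) -> Result<(), Duration> {
        let elapsed = now.saturating_sub(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.rate_per_sec).min(self.burst);
        self.last = self.last.max(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            Err(Duration::from_secs_f64((1.0 - self.tokens) / self.rate_per_sec))
        }
    }
}
```

With the defaults from this message (5 per second, burst of 10), ten back-to-back requests pass and the eleventh is rejected until the bucket refills.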
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Five fixes bundled into one release:
- preserve user-attached tags across crawler upserts
(repo::crawler::sync_tags now scopes to added_by IS NULL; orphaned
attachments from deleted users are reaped as crawler-owned)
- gate manga PATCH and cover endpoints on uploaded_by (require_can_edit
in api::mangas; non-NULL uploaded_by must match the caller)
- equalise login response time across user-existence branches
(run argon2 against a OnceLock-cached dummy hash on the no-user
branch so timing doesn't leak username existence)
- crawler download defences (SSRF allowlist of host literals
including IPv4-mapped IPv6 ranges, 32 MiB streamed size cap,
reject non-whitelisted image types, three-way chapter-probe
classifier replaces the binary #avatar_menu check)
- tighten validation and clean up dead unload path
(attach_tag + create_token enforce 64-char caps; LocalStorage
rejects NUL bytes explicitly; reader flushFinalProgress drops
the always-405 sendBeacon path)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Five cleanups from REVIEW.md §5 / §3:
- Drop the three private `is_unique_violation` helpers in
repo::{user,chapter,bookmark} in favour of sqlx 0.8's
`DatabaseError::is_unique_violation()` method (already used by
repo::collection).
- Remove the unreachable 23505 branch in repo::chapter::create — the
(manga_id, number) UNIQUE was dropped in 0013, so the defensive arm
could no longer fire. A doc note records what to do if uniqueness
is re-added.
- Move three inline SQL queries out of handlers/daemon into repo
functions: bookmarks' chapter-belongs-to-manga guard
(`repo::chapter::belongs_to_manga`), the daemon's dispatch lookup
(`repo::chapter::dispatch_target`), and the daemon's page_count
safety net (`repo::chapter::page_count`). Restores the
handlers→repo layering invariant in CLAUDE.md.
- New `crawler::url_utils` module consolidates host_of / origin_of /
registrable_domain — they used to live in three crawler submodules
with diverging edge-case behaviour. Tests moved with them.
- Doc cross-references on repo::author::set_for_manga and
repo::genre::set_for_manga pointing to the crawler's name-keyed
variants, so the intentional duplication is discoverable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0012_crawler.sql's partial index on `state IN ('pending','failed')`
indexes a state that no code path ever writes — ack_failed in
crawler/jobs.rs only ever moves jobs to 'dead' or 'pending'. The
'failed' branch costs a write on every state change without ever
matching a query. Drop it; the CHECK still allows 'failed' so a
future migration can re-introduce it cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backend: new `app` user (UID 10001), STORAGE_DIR pre-chowned so the
named volume inherits ownership, curl installed for the HEALTHCHECK
that pings /api/v1/health. The crawler's Chromium uses --no-sandbox
already so dropping privileges costs nothing operationally.
Frontend: switch `npm install` to `npm ci` (matches CI; deterministic
versions; refuses to silently rewrite package-lock.json mid-build).
Run as the built-in `node` user via --chown=node:node, add a busybox
wget HEALTHCHECK on port 3000.
Both images now expose container-level health so orchestrators can
take a wedged container out of rotation instead of letting it keep
serving timeouts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Daemon now auto-detects mode per source: Backfill until the first
full walk records `seed_completed:<source>` in `crawler_state`, then
Incremental (newest-first, stops after N consecutive Unchanged
upserts). `CRAWLER_MODE` overrides to a fixed mode; CLI rejects
`auto` since it has no pre-run DB state.
`Source::discover` returns a lazy `DiscoverWalk` so Incremental can
break out mid-walk without prefetching pages. The drop pass and seed
marker are now gated on a true full walk — fixes a latent soft-drop
of the index tail under partial sweeps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LaunchOptions::from_env() and LaunchOptions::default() now return
BrowserMode::Headless. The in-process daemon (via CrawlerConfig::from_env)
and the standalone crawler binary both pick this up — no display
required for production runs, smaller resource footprint.
`Headed` stays as an explicit opt-in via CRAWLER_BROWSER_MODE=headed
for debugging or sites that fingerprint headless Chrome. New unit test
locks the default in place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds PUT /mangas/:id/cover (multipart) and DELETE /mangas/:id/cover so
covers can be replaced or cleared after creation, and wires a dedicated
/manga/[id]/edit SvelteKit route that combines the existing PATCH with
the new cover endpoints. Cover PUT cleans up the old blob when the
extension changes, swallowing StorageError::NotFound so a manually deleted
file doesn't surface as a 404 to the client. The edit link on the manga
detail page is gated on session.user, matching the auth posture of the
underlying handlers.
Also pins the local-dev port story via loadEnv() in vite.config.ts so
VITE_PORT / BACKEND_URL from a (gitignored) .env keep the dev URL
stable across runs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Until now, when the target site returned its 403 "we're sorry, the
request file are not found" response on a page that actually exists,
selectors matched nothing and the crawler treated the page as
"legitimately empty". Pagination walks silently dropped whole pages
worth of mangas, fetch_manga skipped individual entries, and the
startup session probe blamed PHPSESSID for what was a site hiccup.
This branch adds a single detection layer that the whole pipeline
routes through:
- `crawler::detect`: PageError::Transient typed signal, plus two
primitives (`is_broken_page_body` matches the universal 403 body;
`has_logo_sentinel` asserts #logo, the site-wide header element)
and a `retry_on_transient` helper that retries a closure on
Transient with a small attempt budget.
- `navigate()` screens every fetched body for the broken-page
signature before handing it to a selector.
- Parsers (`parse_manga_list_from`, `parse_manga_detail`,
`parse_chapter_pages`) check their structural sentinels (#logo for
full-layout pages; a#pic_container for the reader, which doesn't
render #logo) and return Result<_, PageError>. Empty Vec is now
reserved for genuinely empty pages.
- `discover()` retries each pagination page up to 3× (2s apart) before
failing the whole Discover job — at which point the existing job
system's retry/backoff takes over for longer outages.
- `verify_session` is three-state: broken-page → retry probe;
#logo present but #avatar_menu absent → genuine logout (the only
state that should blame PHPSESSID); both present → ok.
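The three-state session probe can be expressed as a small classifier over the signals named above. `ProbeState` and `classify_probe` are hypothetical names. Treating a missing `#logo` on a non-broken body as transient is an assumption, since the message does not cover that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeState {
    /// Both `#logo` and `#avatar_menu` are present.
    Ok,
    /// Full layout rendered but no avatar menu: the cookie is bad or expired.
    LoggedOut,
    /// Broken-page body or missing layout: retry the probe.
    Transient,
}

pub fn classify_probe(is_broken_body: bool, has_logo: bool, has_avatar_menu: bool) -> ProbeState {
    if is_broken_body || !has_logo {
        return ProbeState::Transient;
    }
    if has_avatar_menu {
        ProbeState::Ok
    } else {
        ProbeState::LoggedOut
    }
}
```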
Test coverage added at the helper level: 13 unit tests for the
detection module (body signature, logo sentinel, PageError, retry
helper), parser-level tests for both transient and legitimately-empty
inputs, and 6 unit tests for the session probe classifier.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After a successful bookmark insert, the create handler spawns a
detached tokio task that calls pipeline::enqueue_pending_for_manga
for every chapter of the manga where page_count = 0 and the source
row is not dropped. Bookmark create returns 201 immediately; enqueue
work happens in the background and its failure is logged without
surfacing to the user (the daily cron sweeps anything missed).
The Phase A dedup index handles re-bookmarks idempotently — deleting
and recreating a bookmark does not duplicate in-flight jobs — and the
Phase B worker pool drains them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The backend now boots an internal crawler daemon that runs a daily
metadata pass (CRAWLER_DAILY_AT in CRAWLER_TZ, advisory-lock guarded
for multi-replica safety) and drains SyncChapterContent jobs from
crawler_jobs through a worker pool. Chromium launches lazily on first
job and is torn down after CRAWLER_IDLE_TIMEOUT_S seconds of inactivity.
Modules:
- crawler::browser_manager — lazy-launch / idle-teardown wrapper
around browser::Handle, with an on_launch hook that re-injects
PHPSESSID on every fresh Chromium spawn.
- crawler::pipeline — run_metadata_pass (the shared discover/upsert
/cover/sync-chapters loop) and the enqueue_bookmarked_pending helper
used by the cron tick.
- crawler::daemon — cron task + worker pool, behind two trait seams
(MetadataPass, ChapterDispatcher) so tests can inject stubs without
standing up Chromium or a live source.
Behavior:
- CRAWLER_DAEMON=false skips daemon spawn entirely (default for tests).
- Catch-up tick fires on startup if the last persisted slot was missed.
- A SyncOutcome::SessionExpired sets a sticky AtomicBool; workers
idle until operator restart with a refreshed PHPSESSID.
- Worker dispatch wrapped in catch_unwind so a panicking handler
marks the job failed instead of taking down the worker.
- Migration 0015 adds a small crawler_state k-v table for the
last_metadata_tick_at watermark.
Dep additions: chrono-tz (IANA TZ parsing).
CLI (bin/crawler) reuses pipeline::run_metadata_pass and now holds
the browser via BrowserManager so the on_launch session injection
flow stays in one place. Inline chapter-content sync semantics are
unchanged — the queue is for the daemon, force-refetches and manual
backfills still bypass it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds enqueue / lease / ack_done / ack_failed / release / reap_done on
crawler::jobs, backed by the existing crawler_jobs table. lease() uses
a single FOR UPDATE SKIP LOCKED CTE that also re-claims stale running
rows (crashed-worker recovery), and ack_failed applies an exponential
backoff capped at 1h before retrying.
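A capped exponential backoff of the kind described for ack_failed might look like this sketch. The base delay and the doubling factor are assumptions; only the 1h cap comes from this message.

```rust
use std::time::Duration;

const MAX_BACKOFF: Duration = Duration::from_secs(60 * 60);

/// Delay before retry number `attempt` (0-based): base * 2^attempt,
/// never more than one hour. Overflow at any step falls back to the cap.
pub fn backoff_delay(attempt: u32, base: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(MAX_BACKOFF).min(MAX_BACKOFF)
}
```

With a hypothetical 30s base, attempts 0 through 6 yield 30s up to 32min, and every later attempt waits the full hour.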
Migration 0014 adds a partial unique index on
(payload->>'chapter_id') restricted to (pending|running)
sync_chapter_content jobs, so producers can just
INSERT ... ON CONFLICT DO NOTHING without racing each other. The slot
frees again the moment the job leaves the in-flight states, so a
future force-refetch can re-enqueue.
Library-only — no daemon, no API hook. Those land in the next two
phases.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PG rejects `SELECT DISTINCT c.id, c.manga_id, cs.source_url ... ORDER BY
c.manga_id, c.created_at` because the ORDER BY references a column not in
the DISTINCT projection. Wrap the DISTINCT in a subquery (which includes
created_at) and apply the ORDER BY in the outer SELECT.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Debug aid: when set in headed mode, the crawler blocks on Ctrl+C at
every shutdown point (early auth bails + normal completion) instead
of closing the browser immediately. Operator can inspect DOM, cookies,
and network state in the visible Chromium window before exit.
Ignored in headless (no window to inspect) — logged as a warning if
set under headless so the operator doesn't sit waiting.
chromiumoxide's `Browser` is `kill_on_drop`, so the close-or-wait
helper must await Ctrl+C *before* the Handle is dropped — otherwise
the Chromium child gets killed out from under the operator.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the metadata pass, the crawler now fetches per-chapter image
content for chapters belonging to bookmarked mangas. Logged-in chapter
pages render every page image at once (no per-page navigation), so the
crawler reuses the operator's browser session via a pasted PHPSESSID
cookie. Each chapter sync is a single transaction: storage puts + page
row inserts + page_count update commit together, or roll back together
on any image error so the chapter stays at page_count=0 and is retried
next run.
New crawler modules:
- `rate_limit::HostRateLimiters`: per-host buckets keyed by URL host,
with optional per-host overrides. Replaces the single shared
`Mutex<RateLimiter>`. Catalog and CDN no longer share a budget;
default 1 req/s per host.
- `session`: derives `.<registrable>.<tld>` from the start URL
(override via `CRAWLER_COOKIE_DOMAIN` for multi-part TLDs), injects
PHPSESSID into the Chromium cookie store, probes `#avatar_menu` at
startup to fail fast on a bad/expired cookie.
- `content`: parses `a#pic_container img:not(.loading)` with `pageN`
id-based sorting (DOM order isn't trusted), then performs the
atomic chapter sync.
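The id-based ordering in the `content` module can be sketched as below, assuming each image carries an id of the form `pageN` and that ids in any other form are skipped. `page_number` and `sort_pages` are hypothetical names; the input pairs are (id, src).

```rust
/// Extracts N from an id such as "page12"; returns None for anything else.
pub fn page_number(id: &str) -> Option<u32> {
    id.strip_prefix("page")?.parse().ok()
}

/// Orders image sources by their numeric page id rather than DOM order.
/// A numeric sort is needed because "page10" sorts before "page2" as text.
pub fn sort_pages(imgs: Vec<(String, String)>) -> Vec<String> {
    let mut numbered: Vec<(u32, String)> = imgs
        .into_iter()
        .filter_map(|(id, src)| page_number(&id).map(|n| (n, src)))
        .collect();
    numbered.sort_by_key(|(n, _)| *n);
    numbered.into_iter().map(|(_, src)| src).collect()
}
```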
bin/crawler additions:
- Concurrent chapter content phase via `futures_util::for_each_concurrent`
(`CRAWLER_CHAPTER_WORKERS`, default 1). Browser is borrowed across
workers — chromiumoxide allows concurrent `new_page` on `&self` —
and per-host rate limit gates total RPS regardless of worker count.
- reqwest gets the `cookies` feature, a `Jar` seeded with PHPSESSID
for the catalog domain only (CDN intentionally not given the
cookie), and `Referer` is set on cover + chapter image fetches.
- New env knobs: `CRAWLER_PHPSESSID`, `CRAWLER_COOKIE_DOMAIN`,
`CRAWLER_USER_AGENT`, `CRAWLER_CHAPTER_WORKERS`,
`CRAWLER_SKIP_CHAPTER_CONTENT`, `CRAWLER_FORCE_REFETCH_CHAPTERS`,
`CRAWLER_CDN_HOST` + `CRAWLER_CDN_RATE_MS`.
- Mid-run session-expired detection: `#avatar_menu` is re-checked on
every chapter page nav; first failure aborts the phase with a
cookie-refresh message.
Bookmark-driven enqueueing is sync-on-crawl-tick only: the bookmarked
chapters with `page_count = 0` are queried at the start of the
chapter-content phase. Sync-on-bookmark via an API hook is deferred
to a follow-up branch — that needs a daemon consumer of crawler_jobs,
which doesn't exist yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real-world sources publish multiple chapters at the same number:
different scanlators ("Ch.52 from bloomingdale" + "Ch.52 from mina"),
translator notices and farewells, alt-translations. The (manga_id,
number) UNIQUE constraint from 0001 silently collapsed all of those
into a single row via the upsert path in repo::crawler. Migration 0013
drops the constraint; sync_manga_chapters now plain-INSERTs each
SourceChapterRef so every parsed chapter survives as its own row.
Identity moves from the (manga_id, number) tuple to the chapter UUID:
- `GET /api/v1/mangas/:manga_id/chapters/:chapter_id` (replaces :number)
- `GET /api/v1/mangas/:manga_id/chapters/:chapter_id/pages`
- `repo::chapter::find_by_id_in_manga` (replaces find_by_manga_and_number)
- Frontend reader route renamed to `/manga/[id]/chapter/[chapter_id]`
- Chapter links throughout (manga page list, continue-reading CTA,
reader prev/next, history rows, bookmark cards) use chapter.id
- API clients getChapter/getChapterPages take a chapter id string
read_progress + bookmarks already FK chapter_id; they only enrich with
chapter_number for display, which is preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Listing links point at the reader's page 1
(`.../uu/br_chapter-N/pg-1/`). The generic `derive_key_from_url` took
the last URL segment and returned `"pg-1"` for every chapter, so all
parsed chapters collapsed onto a single `chapter_sources` row downstream
and the first-manga chapter was the only row that survived. New
`derive_chapter_key_from_url` strips a trailing `/pg-\d+/` before
picking the chapter-identifying segment (`br_chapter-N` / `to_chapter-N`).
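A std-only sketch of that key derivation follows. It assumes the URL has no query string or fragment and that the page segment is exactly `pg-` followed by digits; the real function may handle more cases.

```rust
/// True for a reader page segment such as "pg-1" or "pg-27".
fn is_page_segment(seg: &str) -> bool {
    match seg.strip_prefix("pg-") {
        Some(digits) => !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()),
        None => false,
    }
}

/// Returns the chapter-identifying path segment, ignoring a trailing
/// `/pg-N/` so every page of a chapter maps to the same key.
pub fn derive_chapter_key_from_url(url: &str) -> Option<String> {
    let mut segments: Vec<&str> = url.split('/').filter(|s| !s.is_empty()).collect();
    let ends_with_page = segments.last().map_or(false, |s| is_page_segment(s));
    if ends_with_page {
        segments.pop();
    }
    segments.last().map(|s| s.to_string())
}
```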
Notices, hiatus rows, and duplicate-numbered chapters are preserved as
distinct parser entries. The (manga_id, number) UNIQUE collapse in the
chapters table is a separate, follow-up concern handled in
feat/chapter-id-routing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- TargetSource: first concrete impl of the Source trait, modeled on
the old Puppeteer crawler's selectors (+ status normalization,
tag-count stripping, chapter list)
- DiscoverMode::Backfill walks pagination last->1, reverse within each
page (oldest-first); Incremental walks forward
- RateLimiter (tokio-time aware) plumbed through FetchContext so the
pagination walk honors the same per-host budget as the outer loop
- repo::crawler: ensure_source, upsert_manga_from_source (returns
New/Updated/Unchanged + current cover_image_path for backfill
decisions), sync_manga_chapters, mark_dropped_mangas — all
transactional, with case-insensitive lookups and source-insertable
genres
- Cover image download via reqwest+infer; stored under
mangas/{id}/cover.{ext} via the Storage trait
- Single CRAWLER_PROXY env wires both Chromium (--proxy-server) and
reqwest::Proxy::all (HTTP/HTTPS/SOCKS5)
- Crawler binary: positional start URL or $CRAWLER_START_URL,
$CRAWLER_LIMIT (cap fetches + skip drop pass on partial runs),
$CRAWLER_SKIP_CHAPTERS (disable selector AND sync), $CRAWLER_RATE_MS
- Silences chromiumoxide 0.7's known CDP deserialize log spam via
default tracing filter + CdpError::Serde downgrade
- 9 sqlx integration tests + 11 selector/rate-limit unit tests
Top offset was 44px (header's 60px minus var(--space-4)), placing the bar inside the layout header. Now sticks at var(--app-header-h).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reader gets chapter-aware chevrons + a persistent app frame +
distraction-free focus mode.
- Single-mode chevrons (and ArrowLeft/Right + j/k) advance pages
within the chapter and fall through to the adjacent chapter at the
boundaries. Last page of last chapter / first page of first
chapter disables the chevron and silent-no-ops on the keypress.
- Continuous-mode gets a fixed bottom bar with prev/next chapter
buttons; arrows + j/k jump chapters directly.
- `?page=N` and `?page=last` URL query lets the prev-chapter jump
land on the previous chapter's last page.
- Layout header is fixed at the top; reader nav is sticky just
below it; both stay visible while scrolling so reading settings
are always reachable.
- New "Focus" toggle in the reader nav hides the layout header,
reader nav, and bottom chapter bar with smooth 220ms slide
animations. Exit via Esc or a small floating Minimize2 button at
the top-right (low resting opacity, full on hover). Reset on
reader unmount so it doesn't leak to other pages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `/upload` is now manga-only with optional N initial chapters
staged inline.
- Additional chapters are uploaded from a new `/manga/[id]/upload-chapter`
route, reached via an "Upload chapter" button on the manga page.
- New `ChapterPagesEditor` component: thumbnails next to each row,
click-to-preview-modal, drag-drop + reorder.
- Pages renamed to `page-NNN.<ext>` before multipart submission;
original filenames shown as dimmed reference text during upload
and dropped on submit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The profile overview's bookmark counter showed 0 even when the user had bookmarks because /me/bookmarks left page.total null. Repo now returns the count alongside the rows; handler uses with_total.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
addMangaToCollection crashed when the backend returned 201/200 with no body: the shared client only short-circuited 204. Now any empty body returns undefined.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-user reading progress and uploader attribution.
Schema (migration 0011): `read_progress` table (one row per (user,
manga); chapter_id nullable on chapter delete) and nullable
`uploaded_by` columns on mangas + chapters with partial indexes
scoped to non-null rows.
Endpoints (all `/me/*`, auth-scoped):
- PUT `/v1/me/read-progress` upserts. FK violations + cross-manga
chapter ids both surface as 4xx (404 / 422) so the API can't be
used to write logically invalid rows.
- GET `/v1/me/read-progress` returns a paged, newest-first list.
- GET `/v1/me/read-progress/:manga_id` returns one row enriched with
chapter_number for the manga page's Continue CTA.
- DELETE `/v1/me/read-progress/:manga_id` is idempotent.
- GET `/v1/me/uploads` returns interleaved manga + chapter uploads as a
tagged union; limit-only pagination.
Existing manga + chapter upload handlers stamp `uploaded_by`.
Frontend:
- Reader emits progress on mount + page change (debounce) and via
IntersectionObserver in continuous mode. High-water mark is seeded
from the persisted server value so re-opening a chapter doesn't
regress to page 1. Tab close survives via `sendBeacon` (fallback
`keepalive` fetch); SPA navigation flushes via regular fetch.
- Manga detail page shows "Continue reading Chapter N — page M"
above the chapters list, working even for mangas with >50
chapters.
- New `/profile/history` tab with reading history (clear-per-row,
inline error on failure) and uploads (mangas + chapters mixed
chronologically with type-aware rendering).
171 backend tests (incl. 16 history tests covering ownership, FK
race, cross-link guard, chapter SET NULL behaviour) and 97 frontend
tests + svelte-check clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tabbed user dashboard at `/profile` that absorbs `/settings` and
surfaces bookmarks + collections in one place.
- New `/profile` shell with tabs: Overview (counts), Preferences
(theme + reader prefs, ported from /settings; works for guests
via localStorage), Account (password change; auth-gated),
Bookmarks, Collections. Guest tab list is filtered to what they
can actually use.
- `/settings` is a 308 redirect to `/profile/preferences` so old
bookmarks land cleanly. The "Settings" link in the top nav is
replaced by a Profile link between Upload and Bookmarks; Bookmarks
+ Collections stay as shortcuts per the user spec.
- Extracts `lib/components/BookmarkList.svelte` and
`lib/components/CollectionsGrid.svelte` so the top-level
/bookmarks + /collections routes and the new profile tabs render
the same UI without duplication. Both layers use a three-state
load (authenticated / guest / error) to handle network hiccups
inline.
- Deep links preserved via `?next=` on every sign-in CTA.
88 frontend unit tests + svelte-check clean; 12 of 12 e2e tests in
profile.spec.ts and reader-mode.spec.ts pass (8 other e2e failures
predate this branch and stay flagged for cleanup).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-owned named lists of mangas with an add-to-collection modal on
the manga page and dedicated /collections and /collections/:id pages.
- Schema (0010): `collections` (per-user case-insensitive name
uniqueness) + `collection_mangas` join with cascade FKs.
- Endpoints: full CRUD on `/v1/collections`, idempotent add/remove
for `/v1/collections/:id/mangas`, and `/v1/mangas/:id/my-collections`
for the modal's pre-checked state. Owner-mismatch surfaces as 404
(not 403) so the API doesn't disclose collection existence to
non-owners; the frontend funnels 401 to /login. Three-state PATCH
via a new shared `domain::patch::Patch<T>` lets clients distinguish
"leave alone", "clear", and "set" for description.
- Frontend: reusable `Modal` component (focus trap, opt-in
backdrop close, ESC) and `AddToCollectionModal` with optimistic
toggling that's race-safe under fast clicks. /collections page
renders cover-collage cards; /collections/:id is editable with
per-card remove. Top nav gets a Collections link.
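
For illustration, the three-state PATCH idea can be sketched on the client side in TypeScript. The real `domain::patch::Patch<T>` lives in the Rust backend and is not reproduced here; the `Patch` union and `descriptionPatch` helper below are hypothetical and only show how "leave alone", "clear", and "set" map onto a JSON body.

```typescript
// Three outcomes for one optional, nullable field in a PATCH body.
type Patch<T> =
  | { kind: "leave" } // key absent: leave the stored value alone
  | { kind: "clear" } // key present with null: clear the stored value
  | { kind: "set"; value: T }; // key present with a value: overwrite

// Hypothetical helper: classify the `description` key of a parsed JSON body.
function descriptionPatch(body: Record<string, unknown>): Patch<string> {
  if (!("description" in body)) {
    return { kind: "leave" };
  }
  const value = body["description"];
  if (value === null) {
    return { kind: "clear" };
  }
  if (typeof value === "string") {
    return { kind: "set", value };
  }
  throw new TypeError("description must be a string or null");
}
```

Note that `JSON.stringify` drops `undefined` properties, so a client that wants "clear" has to send an explicit `null`.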
155 backend tests (incl. 21 collection tests covering ownership,
idempotence, sample-cover enrichment, three-state PATCH, FK race);
88 frontend tests; svelte-check clean.
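
The ownership rule covered by those tests (owner mismatch answers 404, not 403) can be sketched as an in-memory lookup. This is an illustration only: the `Collection` fields, the `Lookup` type, and `lookupForUser` are hypothetical, and the real check happens in the backend.

```typescript
type Collection = { id: string; ownerId: string; name: string };

// Hypothetical lookup result: the HTTP status plus the collection when visible.
type Lookup = { status: 200; collection: Collection } | { status: 404 };

function lookupForUser(collections: Collection[], id: string, userId: string): Lookup {
  const found = collections.find((c) => c.id === id);
  // A missing row and a row owned by someone else produce the same answer,
  // so a non-owner cannot tell whether the collection exists.
  if (found === undefined || found.ownerId !== userId) {
    return { status: 404 };
  }
  return { status: 200, collection: found };
}
```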
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `GET /v1/authors/:id` returns `AuthorWithCount` (id, name, manga_count).
- `GET /v1/authors/:id/mangas` returns a paged list of works by that author.
- `GET /v1/authors?search=` autocomplete (already used by Phase 1 forms;
now formally exposed).
- New `/authors/:id` page on the frontend; author chips on the manga
detail page (added in Phase 1) now link to a real page.
- Extracts `lib/components/MangaCard.svelte` — already used by the home
page, ready for the collection page in Phase 3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds first-class manga metadata across the stack:
- **Status** (ongoing / completed), **alternative titles**, normalized
**multi-author** support, **curated genres** (13 seeded), and
**free-form user tags** (case-insensitive, globally shared). Each is
modelled as its own table joined to mangas; `mangas.author` is
backfilled into `authors` + `manga_authors` and dropped.
- New endpoints: `PATCH /v1/mangas/:id` (three-state `description`),
`POST/DELETE /v1/mangas/:id/tags[/:tag_id]`, `GET /v1/genres`,
`GET /v1/tags?search=`.
- `GET /v1/mangas` now returns `MangaCard` (with authors + genres
batched in) and supports `?status=`, `?author_id=`, `?genre_id=`,
`?tag_id=` filters — AND across facets, with empty-array no-op
semantics for the unnest primitive.
- `GET /v1/mangas/:id` returns the enriched `MangaDetail` with tags.
- Frontend: reusable `Chip` component; manga detail page renders
authors as chips linking to `/authors/:id` (Phase 2), a status
badge, alt titles, genres, and tags with inline add/remove (only
the attacher sees remove); upload form supports multi-author /
multi-genre / alt titles / status; search page gets a collapsible
URL-synced filter panel with keyboard-navigable tag autocomplete.
- 126 backend tests (incl. AND-across-facets primitive,
case-insensitive author/tag de-dup, transactional create rollback,
PATCH semantics for missing / null / set on description); 72
frontend tests + svelte-check clean.
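
The filter semantics above (AND across facets, empty array as a no-op) can be illustrated with an in-memory analogue. The actual implementation is a SQL primitive built on `unnest`; the types and function names below are hypothetical, and the "any overlap within one facet" rule is an assumption the commit does not state.

```typescript
type MangaFacets = { status: string; authorIds: number[]; genreIds: number[]; tagIds: number[] };
type Filters = { status?: string; authorIds: number[]; genreIds: number[]; tagIds: number[] };

// An empty `wanted` list means the facet was not requested, so it matches everything.
// Assumption: within a single facet, one shared id is enough to match.
function facetMatches(wanted: number[], actual: number[]): boolean {
  if (wanted.length === 0) {
    return true;
  }
  return wanted.some((id) => actual.indexOf(id) !== -1);
}

// AND across facets: every requested facet has to match.
function matchesFilters(manga: MangaFacets, filters: Filters): boolean {
  return (
    (filters.status === undefined || manga.status === filters.status) &&
    facetMatches(filters.authorIds, manga.authorIds) &&
    facetMatches(filters.genreIds, manga.genreIds) &&
    facetMatches(filters.tagIds, manga.tagIds)
  );
}
```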
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a vertical-scroll continuous mode to the reader alongside the
existing single-page mode. A segmented toggle in the reader top bar
switches between them; in continuous mode a gap selector
(None/Small/Medium/Large → 0/12/32/64px) controls the spacing
between stacked pages. The Settings page mirrors the same controls.
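
As a sketch, the gap selector's label-to-pixel mapping could be a single lookup table. The lowercase keys, `GAP_PX`, and `gapStyle` are hypothetical names; only the four pixel values come from the description above.

```typescript
type Gap = "none" | "small" | "medium" | "large";

// Pixel spacing between stacked pages for each selector option.
const GAP_PX: Record<Gap, number> = { none: 0, small: 12, medium: 32, large: 64 };

// Inline style for the page stack, e.g. applied to a flex column container.
function gapStyle(gap: Gap): string {
  return `gap: ${GAP_PX[gap]}px`;
}
```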
Backend: new user_preferences table (one row per user, lazily
inserted, ON DELETE CASCADE) and GET/PATCH /api/v1/auth/me/preferences
gated by the existing CurrentUser extractor. Allowed values are
enforced both by API validation and table-level CHECK constraints.
Eight integration tests cover defaults, persistence, partial
updates, validation errors, auth, per-user isolation, and cascade.
Frontend: a new preferences store mirrors the theme-store pattern
with a localStorage shadow so anonymous browsers get a consistent
experience and logged-in users don't flash defaults while the
server response is in flight. Server values that the frontend
doesn't recognize (forward-compat) are ignored rather than poisoning
the UI; non-401 PATCH errors revert the optimistic local update;
logout clears the shadow so user A's settings don't follow user B
on a shared browser.
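
The forward-compat rule above (unrecognized server values are ignored) might look something like this. It is a sketch under assumptions: the field names `readerMode` and `gap`, the allowed value strings, and `mergeServerPrefs` are all hypothetical and may not match the real store.

```typescript
type ReaderMode = "single" | "continuous";
type Gap = "none" | "small" | "medium" | "large";
type Prefs = { readerMode: ReaderMode; gap: Gap };

const READER_MODES: readonly string[] = ["single", "continuous"];
const GAPS: readonly string[] = ["none", "small", "medium", "large"];

function isReaderMode(v: unknown): v is ReaderMode {
  return typeof v === "string" && READER_MODES.indexOf(v) !== -1;
}

function isGap(v: unknown): v is Gap {
  return typeof v === "string" && GAPS.indexOf(v) !== -1;
}

// Take each server value only if this frontend build recognizes it;
// otherwise keep whatever the local shadow already holds.
function mergeServerPrefs(local: Prefs, server: Record<string, unknown>): Prefs {
  const mode = server["readerMode"];
  const gap = server["gap"];
  return {
    readerMode: isReaderMode(mode) ? mode : local.readerMode,
    gap: isGap(gap) ? gap : local.gap,
  };
}
```

Validating field by field means one unknown value from a newer backend does not discard the rest of the response.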
In continuous mode native scrolling handles Space/PageDown/arrows;
Home/End remain wired and call scrollIntoView() so jumping to chapter
bounds stays one keystroke. Single-page mode (chevrons, arrow-key
pagination, next-page preload) is unchanged.
Versions bumped 0.13.0 → 0.14.0 in lockstep.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a real design system to replace the per-route ad-hoc styling:
- docs/design-system.md is the contract. Semantic CSS custom-property
tokens (color/type/spacing/radii/shadows/z-index) with verified WCAG
AA/AAA contrast ratios for both themes.
- frontend/src/lib/styles/tokens.css defines :root tokens + a
[data-theme="dark"] override + base element resets, a .form-field
helper, and a global prefers-reduced-motion rule.
- frontend/src/lib/theme.svelte.ts is a Svelte 5 runes store backing
the theme state machine (system | light | dark). localStorage key
'mangalord-theme'; matchMedia subscription that re-resolves on OS
theme change while in 'system' mode; init() / destroy() lifecycle
wired from +layout.svelte.
- frontend/src/app.html runs a synchronous inline script before
%sveltekit.head% to set [data-theme] before first paint. No FOUC.
- /settings gains a System / Light / Dark radiogroup (real fieldset +
legend + radios with lucide icons).
- Every route's <style> block is rewritten to consume tokens — home,
auth, upload (drop-zone + page list), bookmarks, manga overview,
reader.
- @lucide/svelte icons replace ad-hoc text controls per the spec:
Search (icon-only primary), LogOut (icon-only muted), Upload /
Bookmarks / Settings nav inline icons, ChevronLeft/Right for the
reader, ArrowUp/Down/Trash2 for the upload page list. The bookmark
toggle keeps its '☆ Bookmark' / '★ Bookmarked' text verbatim.
- Home search controls split into two rows: input + Search CTA on
row 1, Sort (and future filters) on row 2.
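
The theme state machine described above reduces to two small pure functions. This is a hedged sketch rather than the contents of `theme.svelte.ts`: `parseStoredTheme` and `resolveTheme` are hypothetical names, and falling back to `system` for a missing or unknown stored value is an assumption.

```typescript
type ThemePref = "system" | "light" | "dark";
type ResolvedTheme = "light" | "dark";

// Read the stored preference; anything unexpected falls back to "system" (assumed default).
function parseStoredTheme(raw: string | null): ThemePref {
  return raw === "light" || raw === "dark" || raw === "system" ? raw : "system";
}

// "system" defers to the OS setting; explicit choices always win.
function resolveTheme(pref: ThemePref, systemPrefersDark: boolean): ResolvedTheme {
  if (pref === "system") {
    return systemPrefersDark ? "dark" : "light";
  }
  return pref;
}
```

In a browser, `systemPrefersDark` would typically come from `window.matchMedia("(prefers-color-scheme: dark)").matches`, and the resolved value would be written to the `data-theme` attribute.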
Accessibility: every icon-only button carries aria-label, every
decorative SVG aria-hidden; existing image alt text preserved;
focus-visible rings reach every interactive element including the
visually-hidden theme radios; color is never the sole conveyor.
Version bump 0.12.0 → 0.13.0 across backend/Cargo.toml and
frontend/package.json (feat: → minor per CLAUDE.md).
Bars: svelte-check 0/0, vitest 51/51, playwright 18/18, cargo test
88/88, clippy -D warnings clean. Two rounds of independent review;
verdict ship-ready.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every place that surfaced a manga title used to show *just* the
title — the home page, the reader's back-to-manga link, and the
chapter upload form's manga selector. Adding the cover image
alongside makes the app feel like an actual manga library.
- Home (`/`): manga list switched from a one-line `<a>` per item to
a responsive grid of cards (`auto-fill, minmax(140px, 1fr)`),
each card showing the cover (with 📖 placeholder when no cover is
set), the title (line-clamped to 2 rows), and the author.
- Reader (`/manga/[id]/chapter/[n]`): the back-to-manga link in the
reader header now shows a 28×42 thumbnail of the manga's cover next
to the title. Reuses the placeholder pattern for cover-less mangas.
- Upload (`/upload`): the chapter form's manga `<select>` still uses
a native dropdown (covers don't fit in `<option>`), but a preview
pane below the select now shows the currently-selected manga's
cover + title + author so the user can visually confirm which
manga they're attaching the chapter to.
No backend changes — `cover_image_path` was already in the Manga
JSON; only the frontend needed to read it.
Lockstep version bump to 0.12.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bookmarks list was rendering "Manga bookmark <date>" with no
indication of which manga the bookmark referred to. The data is
already in the DB — the list query just wasn't pulling it.
Backend:
- BookmarkSummary gains manga_title (String) and
manga_cover_image_path (Option<String>). Populated by an INNER JOIN
on `mangas` in `repo::bookmark::list_for_user`. The JOIN is INNER
because `bookmarks.manga_id` has ON DELETE CASCADE, so a bookmark
cannot outlive its manga. Chapter LEFT JOIN unchanged.
- The existing list_me_enriches_chapter_bookmarks_with_chapter_number
test now also asserts manga_title is populated for both chapter-
and manga-level bookmarks, and that manga_cover_image_path is null
when no cover was uploaded.
Frontend:
- Bookmark type carries optional manga_title and
manga_cover_image_path (optional because POST /bookmarks returns
the bare Bookmark, not the enriched summary).
- /bookmarks page redesigned as a grid: cover thumbnail (64×96 with
a placeholder when no cover) on the left, then the manga title (as
the primary link), then either "Chapter N — page M" linked to the
reader, "(chapter removed)" for orphan chapter bookmarks, or
"Whole manga" for manga-level bookmarks. Bookmark date moves to a
subdued footer.
- E2E fixtures track the enriched shape returned by the list endpoint
(vs. the bare Bookmark returned by POST). The toggle test now
asserts the manga title appears on the bookmarks card after the
bookmark is created.
Also: tighten .gitignore. `/data` only catches the compose volume
root; the dev backend writes to `/backend/data` (default STORAGE_DIR
is `./data/storage` relative to backend cwd), so local uploads were
showing as untracked. Adding `/backend/data` keeps test uploads out
of the index.
Lockstep version bump to 0.11.1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 0.10.0 backend endpoint had no UI caller, and the audit flagged it
as dead code: either ship a form or remove the endpoint. This ships
the form, plus the regression test the audit asked for to pin the
docstring contract that bearer tokens keep working.
Backend:
- New test change_password_via_bearer_leaves_bearer_working asserts
that PATCH /me/password called with Authorization: Bearer wipes
cookie sessions but leaves the bearer (api_token) intact and usable
— matches the docstring claim that bot tokens are opt-in to revoke.
Frontend:
- lib/api/auth.ts: new changePassword(input) wrapping PATCH
/v1/auth/me/password. Vitest covers happy 204, 401 unauthenticated
(wrong current), 400 invalid_input (weak new) — same envelope
parsing shape used elsewhere.
- routes/settings/+page.svelte: minimal form with current /
new / confirm fields, derived passwordsMatch + canSubmit guards
(submit stays disabled until current is filled, new is ≥8 chars,
new == confirm). Shows the API's message inline on failure.
Documents the "other devices signed out, bot tokens stay" UX in a
short hint.
- routes/+layout.svelte: new "Settings" link in the session-aware
nav (between username and Logout) for authed users only.
- e2e/settings.spec.ts (5 cases): nav link reaches the form,
successful change shows confirmation + clears the form, 401
surfaces inline, password mismatch keeps submit disabled, anonymous
user gets a sign-in prompt instead of the form.
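
The submit guards listed above can be written as plain functions. `passwordsMatch` and `canSubmit` are the names used in this commit; the `PasswordForm` field names and the `MIN_PASSWORD_LENGTH` constant are hypothetical, and the real page derives these values reactively rather than calling functions.

```typescript
type PasswordForm = { current: string; next: string; confirm: string };

const MIN_PASSWORD_LENGTH = 8;

function passwordsMatch(form: PasswordForm): boolean {
  return form.next === form.confirm;
}

// Submit stays disabled until current is filled, new is long enough, and new == confirm.
function canSubmit(form: PasswordForm): boolean {
  return (
    form.current.length > 0 &&
    form.next.length >= MIN_PASSWORD_LENGTH &&
    passwordsMatch(form)
  );
}
```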
Lockstep version bump to 0.11.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- F1: backend/Dockerfile now copies Cargo.lock alongside Cargo.toml
and builds with --locked, so the production image runs against the
exact crate versions CI tested. Without this, cargo silently
resolved fresh on each image build and "we tested it" stopped being
true for the binary you ship.
- F2: POST /api/v1/mangas/{id}/chapters rejects chapter `number < 1`
with 422 validation_failed. Mirrors the bookmark page>=1 rule from
0.9.4 — chapter numbers are 1-indexed everywhere (URLs, upload
form, reader) and 0/negative numbers had no legitimate use. Three
cases (0, -1, -100) in api_uploads.rs.
- F3: bookmarks/+page.ts no longer re-throws non-401 ApiErrors as
SvelteKit's generic 500 page. Surfaces the error message inline via
a new `data.error` field; the page renders an alert when present.
Same UX shape as the home page's existing error handling.
- F4: dropped Space from the reader keyboard binding. On portrait
phones and narrow desktop windows the page image overflows the
viewport and the user expects Space to scroll — preventDefaulting
it skipped past unread content. ArrowRight + j remain.
- New backend/.dockerignore and frontend/.dockerignore so the local
target/ and node_modules/ don't get shipped into the build context
on every `docker compose build`.
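
The F2 rule can be sketched as a small validator. The backend check itself is in Rust; this TypeScript version only illustrates the behaviour. The `{ code, message }` body shape, the message text, and `validateChapterNumber` are assumptions; the 422 status and the `validation_failed` code come from the description above.

```typescript
type ApiErrorBody = { code: string; message: string };

type Validation =
  | { ok: true; number: number }
  | { ok: false; status: 422; body: ApiErrorBody };

// Chapter numbers are 1-indexed, so 0 and negatives are rejected.
// Written as !(n >= 1) so that NaN is rejected as well.
function validateChapterNumber(n: number): Validation {
  if (!(n >= 1)) {
    return {
      ok: false,
      status: 422,
      body: { code: "validation_failed", message: "chapter number must be >= 1" },
    };
  }
  return { ok: true, number: n };
}
```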
Lockstep version bump to 0.10.2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small documentation gaps the second-pass audit flagged:
- CLAUDE.md described only the Vite dev proxy ("Vite dev-proxies to
the backend"), which left the production path opaque. Now lists
both: the Vite proxy for `npm run dev` and
`frontend/src/hooks.server.ts` for adapter-node. Same-origin cookie
story called out explicitly.
- `/api/v1/files/{key}` is an unauthenticated capability URL by
design: reads stay public, keys are unguessable v4 UUIDs, and a
leaked URL exposes only that one file. Documented both in the module
doc of `backend/src/api/files.rs` (with a pointer at the seam a
future feat/private-libraries branch would use) and in a new
"Capability URLs" section in README so a casual reader doesn't
mistake the lack of auth for an oversight.
No code or behaviour change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small follow-ups from the second-pass audit:
- N1: `manga_upload_rolls_back_when_cover_storage_fails` covers the
manga-side of the transactional rollback path. The chapter case had
a `FailingStorage` regression test already; this completes the
symmetric pair. With fail-on-put-index=0, the cover put fails on
the first call, the transaction aborts, and `SELECT count(*) FROM
mangas WHERE title = 'Berserk'` is 0.
- N2: The SvelteKit proxy now catches network-layer failures from the
upstream `fetch` (DNS / connection refused / TLS handshake) and
returns a 502 with the standard error envelope
(`code: 'upstream_unavailable'`) instead of letting SvelteKit's
generic 500 HTML page through. `client.ts` can `.json()` the result
cleanly so callers see a real ApiError with a meaningful code. The
underlying cause is logged via `console.error` for the operator.
Test in hooks.server.test.ts asserts the 502, the JSON envelope, and
that `resolve` is not called (the proxy short-circuits).
- N3: `GET /api/v1/files/*key` now sets
`X-Content-Type-Options: nosniff`. The upload-time magic-byte sniff
is authoritative for what we declare as Content-Type; `nosniff`
makes the contract explicit so older user-agents can't try to
re-detect HTML/JS in a polyglot file that survived the sniff. Test
in api_uploads.rs asserts the header.
- N4: The /bookmarks page used `{#if b.page}` to gate the "— page N"
display, which falsy-elided a legitimate `page == 0`. Backend now
rejects `page < 1` for new bookmarks (already shipped in 0.9.4),
but any pre-0.9.4 row with page=0 still rendered without its
number. Strengthened to `{#if b.page != null && b.page > 0}`.
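
The N2 behaviour (network-layer failure becomes a 502 with a JSON envelope) could be structured as below. This is a sketch, not the code in `hooks.server.ts`: `ProxyFailure`, `upstreamUnavailable`, and `proxyCall` are hypothetical, the flat `{ code, message }` envelope shape and message text are assumptions, and a plain object stands in for a real `Response`.

```typescript
type ErrorEnvelope = { code: string; message: string };
type ProxyFailure = { status: 502; body: ErrorEnvelope };

// Build the 502 result and log the underlying cause for the operator.
function upstreamUnavailable(cause: unknown): ProxyFailure {
  console.error("upstream fetch failed:", cause);
  return {
    status: 502,
    body: { code: "upstream_unavailable", message: "backend is unreachable" },
  };
}

// Wrap any upstream call: a rejected promise (DNS failure, connection
// refused, TLS handshake error) turns into the 502 result instead of escaping.
async function proxyCall<T>(call: () => Promise<T>): Promise<T | ProxyFailure> {
  try {
    return await call();
  } catch (cause) {
    return upstreamUnavailable(cause);
  }
}
```

Because the failure body is ordinary JSON, a client that always calls `.json()` on the response keeps working and can branch on `code`.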
Lockstep version bump to 0.10.1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>