Dockurr/tor's stock entrypoint binds the control port to localhost
(unreachable from a sibling container), refuses to run as a
non-default user (its setup chowns dirs and su-execs down to its `tor`
user, both requiring root), and skips its own HashedControlPassword
injection whenever the user's torrc declares a ControlPort. The
combination meant the original cookie-via-shared-volume design
couldn't work without fighting the image.

This commit:

- Adds tor/entrypoint.sh, a small wrapper that hashes $PASSWORD with
  `tor --hash-password`, appends the hash to a writable copy of
  /etc/tor/torrc, then execs tor. Container runs as root only for that
  bring-up; the torrc's `User tor` directive drops privs after port
  binding.
- Adds a healthcheck on the tor service that gates downstream
  containers on both 9050 + 9051 actually listening (was
  service_started, which fires before tor finishes bootstrap).
- Loosens MaxCircuitDirtiness 60 → 600. The 60s value would have
  rotated mid-chapter for any chapter with > ~50 images, which is
  exactly the kind of fingerprint we're trying to avoid.
- Wires TOR_CONTROL_PASSWORD as a REQUIRED .env var on both sides
  (PASSWORD on tor, CRAWLER_TOR_CONTROL_PASSWORD on backend).
  docker-compose.yml fails fast if unset.
- Removes the tor-data shared volume on backend (cookie auth is no
  longer the default; operators wanting cookie can mount it back).
- Documents the pivot + the cookie-vs-password tradeoff in
  .env.example.

End-to-end validated: `docker compose up -d tor`, then
`printf 'AUTHENTICATE "test"\r\nSIGNAL NEWNYM\r\nQUIT\r\n' | nc tor 9051`
returns three `250 OK` lines.

Audit ref: #2, #3, #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
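
The wrapper described in the commit message above could look roughly like the sketch below. It is not the actual tor/entrypoint.sh, only a minimal illustration of the three steps the message names (hash, append to a writable torrc copy, exec). The function names, the `TORRC_SRC` and `TORRC_RUNTIME` variables, the `/tmp/torrc` default, and the `RUN_TOR_ENTRYPOINT` guard are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a tor/entrypoint.sh-style wrapper.
# Names, paths, and the RUN_TOR_ENTRYPOINT guard are assumptions.
set -eu

# Copy the torrc to a writable location and append the hashed password.
# $1 = source torrc, $2 = writable destination, $3 = hash from tor
build_torrc() {
    cp "$1" "$2"
    printf 'HashedControlPassword %s\n' "$3" >> "$2"
}

# Hash the plaintext password. tor may print log lines before the hash,
# so keep only the last line that looks like a hash (it starts with "16:").
hash_password() {
    tor --hash-password "$1" | grep '^16:' | tail -n 1
}

main() {
    # Fail fast when the password is missing, mirroring the compose check.
    : "${PASSWORD:?PASSWORD must be set}"
    src=${TORRC_SRC:-/etc/tor/torrc}
    dst=${TORRC_RUNTIME:-/tmp/torrc}
    build_torrc "$src" "$dst" "$(hash_password "$PASSWORD")"
    # Replace the shell with tor; a `User tor` line in the torrc is what
    # would drop privileges after the ports are bound.
    exec tor -f "$dst"
}

# Run only when asked to, so the functions can be exercised in isolation.
if [ "${RUN_TOR_ENTRYPOINT:-0}" = "1" ]; then
    main "$@"
fi
```

A real entrypoint would most likely call `main` unconditionally; the guard is only there so `build_torrc` can be tested without a tor binary on the PATH.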
135 lines
6.3 KiB
Plaintext
# Copy to .env for `docker compose up --build`. Local-dev runs (cargo run
# / npm run dev) read backend/.env if present, or pick up the variables
# from your shell.
#
# Production note: COOKIE_SECURE=true (the default below) makes browsers
# refuse to send the session cookie over plain HTTP. Run with a
# TLS-terminating reverse proxy (Caddy, Traefik, nginx) in front; the
# compose file here doesn't ship one. Local/dev runs without HTTPS
# should set COOKIE_SECURE=false.

# ----- Postgres -----
# These are read by the Postgres container *and* by DATABASE_URL below;
# changing them after the first boot won't migrate existing data, so set
# them up front for any new deployment.
#
# POSTGRES_PASSWORD is REQUIRED: docker-compose.yml fails fast if it
# isn't set in this file, to prevent a deploy without an .env booting
# Postgres with a publicly-known credential.
POSTGRES_USER=mangalord
POSTGRES_PASSWORD=change-me-to-a-strong-random-string
POSTGRES_DB=mangalord

# ----- Backend -----
# The user, password, and database name embedded in DATABASE_URL must
# match the POSTGRES_* values above.
DATABASE_URL=postgres://mangalord:mangalord@postgres:5432/mangalord
BIND_ADDRESS=0.0.0.0:8080
STORAGE_DIR=/var/lib/mangalord/storage
RUST_LOG=info,mangalord=debug,chromiumoxide::conn=off,chromiumoxide::handler=off

# ----- Auth / cookies -----
# COOKIE_SECURE controls whether the `Secure` flag is set on the session
# cookie. Keep `true` in production (HTTPS); set to `false` if you're
# serving over plain HTTP locally (e.g., behind a dev reverse proxy).
COOKIE_SECURE=true
# COOKIE_DOMAIN scopes the session cookie. Leave empty to default to the
# requesting host. Set when serving the API and frontend on subdomains of
# a shared parent (e.g., `.example.com`) so the cookie is shared.
COOKIE_DOMAIN=
# Session lifetime in days. Expired sessions are no longer accepted and
# get reaped lazily.
SESSION_TTL_DAYS=30

# ----- Auth brute-force rate limits -----
# Token-bucket budget shared across /auth/login, /auth/register, and
# /auth/me/password. Set AUTH_RATE_PER_SEC=0 to disable (e.g. behind a
# rate-limiting reverse proxy that already enforces a budget).
AUTH_RATE_PER_SEC=5
AUTH_RATE_BURST=10

# ----- CORS -----
# Comma-separated origins allowed to call the API with credentials.
# Default is empty: same-origin only. Set when frontend and backend live
# on different hosts. Example: https://app.example.com,https://app.example.de
CORS_ALLOWED_ORIGINS=

# ----- Upload limits -----
# Per-request body cap. axum rejects oversized requests with 413 before
# our handlers run. Default 200 MiB.
MAX_REQUEST_BYTES=209715200
# Per-image-part cap. Enforced after reading each part, so a single
# oversized image is rejected even when the total request fits.
# Default 20 MiB.
MAX_FILE_BYTES=20971520

# ----- Crawler download safety -----
# Hosts the crawler is allowed to fetch images/covers from, in addition
# to CRAWLER_START_URL's host and CRAWLER_CDN_HOST. Comma-separated.
# Defends against SSRF via scraped <img src="http://10.0.0.1/...">.
CRAWLER_DOWNLOAD_ALLOWLIST=
# Bypass the host allowlist entirely. Intended for sources that shard
# images across numbered CDN subdomains (cdn1/cdn2/…) where enumerating
# every host upfront is impractical. The private-IP / localhost /
# non-http(s) scheme defenses STAY ON: a scraped
# <img src="http://10.0.0.1/"> is still refused with this flag set.
CRAWLER_ALLOW_ANY_HOST=false
# Hard cap on a single image body. Default 32 MiB.
CRAWLER_MAX_IMAGE_BYTES=33554432
# Path to a system Chromium binary. When set, the crawler skips the
# bundled-fetcher download. Required on platforms without a usable
# upstream Chromium build (notably Linux_arm64 / Raspberry Pi). On
# Debian: /usr/bin/chromium-headless-shell or /usr/bin/chromium. On
# Ubuntu the package is chromium-browser (different path). Pair with
# `docker compose build --build-arg INSTALL_CHROMIUM=true backend` so
# the image actually contains the binary.
CRAWLER_CHROMIUM_BINARY=

# ----- Crawler TOR proxy + recircuit -----
# The compose stack ships a `tor` service (dockurr/tor) and defaults
# CRAWLER_PROXY to it, so by default all crawler traffic exits via the
# TOR network. To opt out, set CRAWLER_PROXY= (empty) AND
# CRAWLER_TOR_CONTROL_URL= (empty) below; the tor service can stay
# running, it just won't be used.
#
# Going through TOR adds latency to every fetch; image downloads in
# particular slow noticeably. The win is on sites that rate-limit or
# fingerprint by exit IP: a NEWNYM recircuit makes a fresh exit cheap
# to obtain.
#
# CRAWLER_PROXY: SOCKS5(h) URL. Use `socks5h://` (not `socks5://`) so
# DNS resolution also goes through TOR, avoiding leaks via the host's
# resolver. Leave empty to talk to the upstream directly.
CRAWLER_PROXY=socks5h://tor:9050
# Control-port URL for SIGNAL NEWNYM ("get a fresh circuit"). Triggered
# automatically on bad pages (broken-page body, missing #logo) and on
# the Unauthenticated session probe outcome. Leave empty to disable
# the recircuit feature (the SOCKS proxy still works).
CRAWLER_TOR_CONTROL_URL=tcp://tor:9051
# Max NEWNYM-and-retry cycles per recircuit-eligible failure. Default 3.
CRAWLER_TOR_RECIRCUIT_MAX_ATTEMPTS=3

# ----- TOR control-port password -----
# Shared between the bundled dockurr/tor service (which hashes it into
# its HashedControlPassword) and the backend's
# CRAWLER_TOR_CONTROL_PASSWORD. REQUIRED: docker-compose.yml fails
# fast if absent. Generate a strong random string (for example with
# `openssl rand -hex 32`); rotate by setting a new value and restarting
# both `tor` and `backend`.
#
# Operators running their own non-dockurr tor daemon with cookie-file
# auth can ignore this var and instead set
# CRAWLER_TOR_CONTROL_COOKIE_PATH on the backend; the TorController
# prefers cookie when both are present.
TOR_CONTROL_PASSWORD=change-me-to-a-strong-random-string

# ----- Frontend -----
# The frontend container runs SvelteKit's Node adapter on :3000 and
# proxies /api/* to BACKEND_URL via src/hooks.server.ts. In compose the
# default `http://backend:8080` reaches the backend service over the
# internal docker network. Override only if you're running the
# frontend container against a backend somewhere else.
BACKEND_URL=http://backend:8080
# Per-request wall-clock cap for the /api/* reverse proxy (milliseconds).
# Default 300000 (5 min) covers a typical 200 MiB chapter upload over
# 25 Mbps (about 67 s) with headroom down to roughly 5.6 Mbps; raise for
# users on slower upstream links or lower if a tighter front proxy
# already bounds the request lifetime.
BACKEND_PROXY_TIMEOUT_MS=300000