fix(deploy): pivot tor service to password auth + wrapper entrypoint

Dockurr/tor's stock entrypoint binds the control port to localhost
(unreachable from a sibling container), refuses to run as a non-default
user (its setup chowns dirs and su-execs down to its `tor` user, both
requiring root), and skips its own HashedControlPassword injection
whenever the user's torrc declares a ControlPort. The combination meant
the original cookie-via-shared-volume design couldn't work without
fighting the image.

This commit:

- Adds tor/entrypoint.sh, a small wrapper that hashes $PASSWORD with
  `tor --hash-password`, appends the hash to a writable copy of
  /etc/tor/torrc, then execs tor. Container runs as root only for that
  bring-up; the torrc's `User tor` directive drops privs after port
  binding.
- Adds a healthcheck on the tor service that gates downstream
  containers on both 9050 + 9051 actually listening (was
  service_started, which fires before tor finishes bootstrap).
- Loosens MaxCircuitDirtiness 60 → 600. The 60s value would have
  rotated mid-chapter for any chapter with > ~50 images, which is
  exactly the kind of fingerprint we're trying to avoid.
- Wires TOR_CONTROL_PASSWORD as a REQUIRED .env var on both sides
  (PASSWORD on tor, CRAWLER_TOR_CONTROL_PASSWORD on backend).
  docker-compose.yml fails fast if unset.
- Removes the tor-data shared volume on backend (cookie auth is no
  longer the default; operators wanting cookie can mount it back).
- Documents the pivot + the cookie-vs-password tradeoff in
  .env.example.

End-to-end validated: `docker compose up -d tor`, then
`printf 'AUTHENTICATE "test"\r\nSIGNAL NEWNYM\r\nQUIT\r\n' | nc tor 9051`
returns three `250 OK` lines.

Audit ref: #2, #3, #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
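
The `nc` validation in the commit message can also be scripted. What follows is a minimal Python sketch under stated assumptions, not the backend's actual TorController: the function names are hypothetical, and success is taken to mean that every reply line starts with `250 `. It sends the same AUTHENTICATE / SIGNAL NEWNYM / QUIT exchange over a plain TCP socket.

```python
import socket


def quote_password(password: str) -> str:
    """Wrap the password in double quotes, escaping backslashes and quotes."""
    escaped = password.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_newnym_commands(password: str) -> bytes:
    """Build the three CRLF-terminated control-port commands as one payload."""
    lines = [
        "AUTHENTICATE " + quote_password(password),
        "SIGNAL NEWNYM",
        "QUIT",
    ]
    return "".join(line + "\r\n" for line in lines).encode("utf-8")


def all_replies_ok(raw: bytes, expected: int = 3) -> bool:
    """Return True if exactly `expected` reply lines arrived, each a 250 reply."""
    text = raw.decode("utf-8", "replace")
    replies = [line for line in text.split("\r\n") if line]
    return len(replies) == expected and all(
        line.startswith("250 ") for line in replies
    )


def request_new_circuit(
    host: str, port: int, password: str, timeout: float = 10.0
) -> bool:
    """Send the commands, read until the daemon closes, and check the replies."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_newnym_commands(password))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection after QUIT
                break
            chunks.append(data)
    return all_replies_ok(b"".join(chunks))
```

From a container on the same compose network, `request_new_circuit("tor", 9051, "test")` would mirror the manual check, with `test` standing in for whatever TOR_CONTROL_PASSWORD is set to. A wrong password should yield a non-250 reply, so the function returns False rather than raising.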
27
.env.example
@@ -90,23 +90,36 @@ CRAWLER_CHROMIUM_BINARY=
 # CRAWLER_TOR_CONTROL_URL= (empty) below — the tor service can stay
 # running, it just won't be used.
 #
+# Going through TOR adds latency to every fetch; image downloads in
+# particular slow noticeably. The win is on sites that rate-limit or
+# fingerprint by exit IP — NEWNYM recirculation makes a fresh exit
+# cheap to reach for.
+#
 # CRAWLER_PROXY: SOCKS5(h) URL. Use `socks5h://` (not `socks5://`) so
 # DNS resolution also goes through TOR, avoiding leaks via the host's
 # resolver. Leave unset to talk to the upstream directly.
 CRAWLER_PROXY=socks5h://tor:9050
 # Control-port URL for SIGNAL NEWNYM ("get a fresh circuit"). Triggered
 # automatically on bad pages (broken-page body, missing #logo) and on
-# the Unauthenticated session probe outcome. Leave unset to disable the
-# recircuit feature (the SOCKS proxy still works).
+# the Unauthenticated session probe outcome. Leave unset to disable
+# the recircuit feature (the SOCKS proxy still works).
 CRAWLER_TOR_CONTROL_URL=tcp://tor:9051
-# Auth — cookie file (preferred) or password (HashedControlPassword).
-# Cookie wins when both are set. The bundled torrc enables cookie auth
-# and shares /var/lib/tor between containers via a named volume.
-CRAWLER_TOR_CONTROL_COOKIE_PATH=/var/lib/tor/control_auth_cookie
-# CRAWLER_TOR_CONTROL_PASSWORD=
 # Max NEWNYM-and-retry cycles per recircuit-eligible failure. Default 3.
 CRAWLER_TOR_RECIRCUIT_MAX_ATTEMPTS=3
 
+# ----- TOR control-port password -----
+# Shared between the bundled dockurr/tor service (which hashes it into
+# its HashedControlPassword) and the backend's
+# CRAWLER_TOR_CONTROL_PASSWORD. REQUIRED — docker-compose.yml fails
+# fast if absent. Generate a strong random string; rotate by setting
+# a new value and restarting both `tor` and `backend`.
+#
+# Operators running their own non-dockurr tor daemon with cookie-file
+# auth can ignore this var and instead set
+# CRAWLER_TOR_CONTROL_COOKIE_PATH on the backend — the TorController
+# prefers cookie when both are present.
+TOR_CONTROL_PASSWORD=change-me-to-a-strong-random-string
+
 # ----- Frontend -----
 # The frontend container runs SvelteKit's Node adapter on :3000 and
 # proxies /api/* to BACKEND_URL via src/hooks.server.ts. In compose the