Dockurr/tor's stock entrypoint binds the control port to localhost
(unreachable from a sibling container), refuses to run as a non-default
user (its setup chowns dirs and su-execs down to its `tor` user, both
requiring root), and skips its own HashedControlPassword injection
whenever the user's torrc declares a ControlPort. The combination meant
the original cookie-via-shared-volume design couldn't work without
fighting the image.

This commit:

- Adds tor/entrypoint.sh, a small wrapper that hashes $PASSWORD with
  `tor --hash-password`, appends the hash to a writable copy of
  /etc/tor/torrc, then execs tor. Container runs as root only for that
  bring-up; the torrc's `User tor` directive drops privs after port
  binding.
- Adds a healthcheck on the tor service that gates downstream
  containers on both 9050 + 9051 actually listening (was
  service_started, which fires before tor finishes bootstrap).
- Loosens MaxCircuitDirtiness 60 → 600. The 60s value would have
  rotated mid-chapter for any chapter with > ~50 images, which is
  exactly the kind of fingerprint we're trying to avoid.
- Wires TOR_CONTROL_PASSWORD as a REQUIRED .env var on both sides
  (PASSWORD on tor, CRAWLER_TOR_CONTROL_PASSWORD on backend).
  docker-compose.yml fails fast if unset.
- Removes the tor-data shared volume on backend (cookie auth is no
  longer the default; operators wanting cookie can mount it back).
- Documents the pivot + the cookie-vs-password tradeoff in
  .env.example.

End-to-end validated: `docker compose up -d tor`, then
`printf 'AUTHENTICATE "test"\r\nSIGNAL NEWNYM\r\nQUIT\r\n' | nc tor 9051`
returns three `250 OK` lines.

Audit ref: #2, #3, #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
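The healthcheck mentioned in the commit message is not shown on this page. As a rough sketch of the idea (both ports must accept connections before dependents start), something like the following could work. The function names `port_open` and `all_ports_open` and the `PROBE` override are hypothetical, and the sketch assumes the image ships a netcat that supports `-z`, which may not be the case.

```shell
#!/bin/sh
# Hypothetical healthcheck helper; not taken from the repository.
# Assumption: a netcat with -z (connect-only probe) is on PATH.

# Succeeds if a TCP connection to host $1, port $2 can be opened.
port_open() {
    nc -z "$1" "$2" 2>/dev/null
}

# Succeeds only if every listed port on the host accepts a connection.
# Usage: all_ports_open HOST PORT [PORT...]
# PROBE may name an alternative probe function (handy for testing);
# it defaults to port_open.
all_ports_open() {
    host=$1
    shift
    for port in "$@"; do
        "${PROBE:-port_open}" "$host" "$port" || return 1
    done
}
```

A compose healthcheck could then call `all_ports_open 127.0.0.1 9050 9051` and rely on its exit status. This only confirms that the sockets are listening; it makes no claim about how far tor's bootstrap has progressed.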
41 lines
1.6 KiB
Bash
Executable File
#!/bin/sh
# Mangalord wrapper around dockurr/tor's tor binary.
#
# We bypass the image's stock entrypoint for two reasons:
#   1. It generates a `ControlPort 9051` line that binds to localhost
#      only (tor's default), but our backend lives in a separate
#      container and needs the port bound on 0.0.0.0:9051.
#   2. It then *skips* writing HashedControlPassword whenever the
#      user's torrc declares a ControlPort, so we can't both bind to
#      0.0.0.0 and benefit from its auto-hashing; it's one or the
#      other. Doing the hashing ourselves is simpler than working
#      around its logic.
#
# This wrapper hashes $PASSWORD with `tor --hash-password`, appends a
# `HashedControlPassword` line to a writable copy of /etc/tor/torrc,
# then execs tor. The container starts as root (the image default),
# which is acceptable inside a single-purpose container; 9050/9051 are
# unprivileged ports, so binding them does not itself require root.

set -eu

if [ -z "${PASSWORD:-}" ]; then
    echo "ERROR: PASSWORD env must be set (the plain string the backend will" >&2
    echo "       send as CRAWLER_TOR_CONTROL_PASSWORD)" >&2
    exit 1
fi

# `tor --hash-password` prints the hash on the last line of stdout
# (preceded by initialization noise).
HASH=$(tor --hash-password "$PASSWORD" 2>/dev/null | tail -n1)
if [ -z "$HASH" ]; then
    echo "ERROR: 'tor --hash-password' produced no output" >&2
    exit 1
fi

# /etc/tor/torrc is bind-mounted read-only, so copy + append.
cp /etc/tor/torrc /tmp/torrc
printf '\n# Injected by mangalord-entrypoint.sh from $PASSWORD env.\nHashedControlPassword %s\n' "$HASH" >> /tmp/torrc

exec tor -f /tmp/torrc
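
The script above only checks that `tor --hash-password` printed something. Because it takes the last line of stdout, a stray log line would pass that check and end up in the torrc. A stricter check is possible, sketched below under the assumption that the hash has tor's usual salted form: the prefix `16:` followed by 58 hexadecimal characters. The helper name `is_tor_hash` is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original entrypoint.
# Returns 0 when $1 looks like tor's hashed control password:
# "16:" followed by exactly 58 hexadecimal characters.
is_tor_hash() {
    case "$1" in
        16:*) hex=${1#16:} ;;   # strip the "16:" prefix
        *) return 1 ;;          # wrong or missing prefix
    esac
    # Length must be exactly 58 characters.
    [ "${#hex}" -eq 58 ] || return 1
    # Reject anything containing a non-hex character.
    case "$hex" in
        *[!0-9A-Fa-f]*) return 1 ;;
    esac
    return 0
}
```

In the entrypoint, a call such as `is_tor_hash "$HASH" || exit 1` could sit next to the existing empty-output check, so that a malformed value never reaches `/tmp/torrc`.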