Mangalord/tor/torrc
MechaCat02 ecbbebafc4 feat(deploy): dockurr/tor service + torrc; wire crawler to use it by default
Adds a `tor` service to the compose stack (dockurr/tor) with a torrc
tuned for the crawler — SOCKS5 on 9050 with IsolateDestAddr +
IsolateDestPort so NEWNYM picks up promptly, control port on 9051
with cookie auth, MaxCircuitDirtiness 60.

Backend defaults CRAWLER_PROXY → socks5h://tor:9050 and
CRAWLER_TOR_CONTROL_URL → tcp://tor:9051, so Tor and circuit rotation
are on out of the box. Operators can override both to empty in .env
to opt out without removing the service.

The tor-data named volume is mounted ro on the backend so it can read
/var/lib/tor/control_auth_cookie; CookieAuthFileGroupReadable handles
the permissions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 20:01:04 +02:00
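
The opt-out rule in the commit message (unset means "use the default", explicitly empty means "disabled") can be sketched as below. This is only an illustration in Python, not the backend's actual code: the variable names and default URLs are quoted from the commit message, while `resolve` and `crawler_settings` are hypothetical helpers.

```python
import os
from typing import Mapping, Optional

# Defaults quoted from the commit message.
DEFAULT_PROXY = "socks5h://tor:9050"
DEFAULT_CONTROL_URL = "tcp://tor:9051"


def resolve(env: Mapping[str, str], name: str, default: str) -> Optional[str]:
    """Return the default when the variable is unset, None when it is set to empty."""
    raw = env.get(name)
    if raw is None:
        return default      # variable absent: feature stays on out of the box
    value = raw.strip()
    return value or None    # explicitly empty: operator opted out


def crawler_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper collecting both crawler settings."""
    return {
        "proxy": resolve(env, "CRAWLER_PROXY", DEFAULT_PROXY),
        "tor_control_url": resolve(
            env, "CRAWLER_TOR_CONTROL_URL", DEFAULT_CONTROL_URL
        ),
    }
```

If the defaults are instead applied through Compose variable interpolation, note that `${VAR:-default}` treats an empty value as unset, whereas `${VAR-default}` preserves the empty override.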

# torrc for the Mangalord crawler.
#
# Mounted into the dockurr/tor container at /etc/tor/torrc. The
# crawler talks to this daemon over the internal compose network only:
# `expose:` on the tor service surfaces 9050/9051 to sibling
# containers, never to the host.
# SOCKS5 proxy that reqwest and Chromium use. IsolateDestAddr +
# IsolateDestPort keep streams to different (destination address,
# port) pairs off each other's circuits, so each new destination
# draws its own circuit. After a SIGNAL NEWNYM, new streams are put
# on clean circuits, so the next navigation picks up the new identity
# instead of reusing an existing dirty circuit.
SOCKSPort 0.0.0.0:9050 IsolateDestAddr IsolateDestPort
# Control port for SIGNAL NEWNYM. Cookie auth means no secret to manage
# in .env — the cookie file is created by the daemon at startup and
# shared with the backend container via the named `tor-data` volume.
# CookieAuthFileGroupReadable lets the backend's gid read it without
# having to run as root.
ControlPort 0.0.0.0:9051
CookieAuthentication 1
CookieAuthFile /var/lib/tor/control_auth_cookie
CookieAuthFileGroupReadable 1
# Keep circuits short-lived: Tor stops attaching new streams to a
# circuit this many seconds after its first use, so the visible exit
# rotates soon even without a NEWNYM. Default is 600s (10 min); 60s is
# short enough that retries after a brief site rate-limit window
# usually see a new IP.
MaxCircuitDirtiness 60
# Data + logs.
DataDirectory /var/lib/tor
Log notice stdout
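
For reference, a client could use the control-port settings above roughly as follows. This is a minimal Python sketch, not the crawler's actual implementation: it assumes the cookie is readable at the CookieAuthFile path and that the control URL has the `tcp://host:port` form from the commit message. The function names are hypothetical. The `AUTHENTICATE` and `SIGNAL NEWNYM` commands and the `250 OK` reply come from the Tor control protocol.

```python
import socket
from pathlib import Path
from urllib.parse import urlsplit

# CookieAuthFile from the torrc above, as seen through the shared volume.
COOKIE_PATH = Path("/var/lib/tor/control_auth_cookie")


def authenticate_command(cookie: bytes) -> bytes:
    """Build the AUTHENTICATE line: the raw cookie bytes, hex-encoded."""
    return b"AUTHENTICATE " + cookie.hex().encode("ascii") + b"\r\n"


def read_reply(reader) -> str:
    """Read one reply; '250-...' lines continue, '250 ...' ends the reply."""
    lines = []
    while True:
        line = reader.readline().decode("ascii", "replace").rstrip("\r\n")
        if not line:
            raise ConnectionError("control port closed the connection")
        lines.append(line)
        if len(line) < 4 or line[3] == " ":
            return "\n".join(lines)


def request_new_identity(control_url="tcp://tor:9051",
                         cookie_path=COOKIE_PATH, timeout=10.0) -> None:
    """Authenticate with the cookie, then ask Tor for clean circuits."""
    parts = urlsplit(control_url)
    cookie = Path(cookie_path).read_bytes()
    steps = (
        ("AUTHENTICATE", authenticate_command(cookie)),
        ("SIGNAL NEWNYM", b"SIGNAL NEWNYM\r\n"),
    )
    address = (parts.hostname, parts.port)
    with socket.create_connection(address, timeout=timeout) as sock:
        with sock.makefile("rwb") as stream:
            for label, command in steps:
                stream.write(command)
                stream.flush()
                reply = read_reply(stream)
                if not reply.startswith("250"):
                    # Report the step name only, never the cookie itself.
                    raise RuntimeError(f"{label} failed: {reply}")
```

Tor may rate-limit NEWNYM, so calling `request_new_identity()` in a tight loop is unlikely to yield a new exit each time. Connections that are already open stay on their old circuit until they are closed.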