Compare commits

..

1 Commits

Author SHA1 Message Date
MechaCat02
9b4f3525f6 feat: incremental crawl mode with seed-completion gate (0.33.0)
Daemon now auto-detects mode per source: Backfill until the first
full walk records `seed_completed:<source>` in `crawler_state`, then
Incremental (newest-first, stops after N consecutive Unchanged
upserts). `CRAWLER_MODE` overrides to a fixed mode; CLI rejects
`auto` since it has no pre-run DB state.

`Source::discover` returns a lazy `DiscoverWalk` so Incremental can
break out mid-walk without prefetching pages. The drop pass and seed
marker are now gated on a true full walk — fixes a latent soft-drop
of the index tail under partial sweeps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 06:41:16 +02:00
8 changed files with 6 additions and 266 deletions

View File

@@ -1,23 +1,13 @@
# Copy to .env for `docker compose up --build`. Local-dev runs (cargo run
# / npm run dev) read backend/.env if present, or pick up the variables
# from your shell.
#
# Production note: COOKIE_SECURE=true (the default below) makes browsers
# refuse to send the session cookie over plain HTTP. Run with a TLS-
# terminating reverse proxy (Caddy, Traefik, nginx) in front — the
# compose file here doesn't ship one. Local/dev runs without HTTPS
# should set COOKIE_SECURE=false.
# ----- Postgres -----
# These are read by the Postgres container *and* by DATABASE_URL below;
# changing them after the first boot won't migrate existing data, so set
# them up front for any new deployment.
#
# POSTGRES_PASSWORD is REQUIRED — docker-compose.yml fails fast if it
# isn't set in this file, to prevent a deploy without an .env booting
# Postgres with a publicly-known credential.
POSTGRES_USER=mangalord
POSTGRES_PASSWORD=change-me-to-a-strong-random-string
POSTGRES_PASSWORD=mangalord
POSTGRES_DB=mangalord
# ----- Backend -----

View File

@@ -1,71 +0,0 @@
# Gitea Actions
The [`deploy`](workflows/deploy.yml) workflow runs on every push to `main`
(and via manual `workflow_dispatch`). It tests, builds, pushes the images
to a private registry, and rolls the stack over by SSH on the target host.
## Required secrets
Set under *Repo Settings → Actions → Secrets*:
| Name | Example | Purpose |
| -------------------- | ------------------------ | ---------------------------------------------------------------- |
| `REGISTRY_URL` | `registry.example.com` | Registry host. No scheme, no trailing slash. |
| `REGISTRY_USERNAME` | `mangalord-ci` | `docker login` user. |
| `REGISTRY_PASSWORD` | `<token>` | `docker login` token/password. |
| `SSH_HOST` | `mangalord.example.com` | Deploy target hostname/IP. |
| `SSH_USER` | `deploy` | SSH user on the target (must be in the `docker` group). |
| `SSH_PRIVATE_KEY` | `-----BEGIN OPENSSH...` | Private key authorised in the target user's `authorized_keys`. |
| `SSH_PORT` | `22` | Optional. Defaults to `22` if unset. |
## Required variables
Set under *Repo Settings → Actions → Variables* (not secrets — they appear
in logs):
| Name | Example | Purpose |
| ------------- | ------------------------ | ---------------------------------------------------------------------- |
| `DEPLOY_PATH` | `/srv/mangalord` | Directory on target holding `docker-compose.yml`, `.env`, and the prod overlay. |
## One-time host setup
The workflow assumes the deploy target already has:
1. Docker + Docker Compose v2 installed and the `SSH_USER` in the `docker` group.
2. `$DEPLOY_PATH/docker-compose.yml` (copy of the repo's [docker-compose.yml](../docker-compose.yml)).
3. `$DEPLOY_PATH/docker-compose.prod.yml` (copy of the repo's [docker-compose.prod.yml](../docker-compose.prod.yml)).
4. `$DEPLOY_PATH/.env` populated from [.env.example](../.env.example) with production values (real `POSTGRES_PASSWORD`, `COOKIE_SECURE=true`, etc.).
Bootstrap once:
```bash
ssh deploy@mangalord.example.com
sudo mkdir -p /srv/mangalord && sudo chown deploy:deploy /srv/mangalord
cd /srv/mangalord
# place docker-compose.yml, docker-compose.prod.yml, and .env here
```
The first workflow run will pull the images, bring the stack up, and run
the embedded migrations on startup.
## Image tags
Every push produces three tags per image:
- `mangalord-{backend,frontend}:latest`
- `mangalord-{backend,frontend}:<git-sha>` — used by the deploy job; lets
you pin a deploy to a specific commit
- `mangalord-{backend,frontend}:<version>` — the version from
[backend/Cargo.toml](../backend/Cargo.toml) (verified in lockstep with
[frontend/package.json](../frontend/package.json))
## Rollback
SSH to the target, set `IMAGE_TAG` to a previous commit SHA, and re-up:
```bash
cd /srv/mangalord
export REGISTRY_URL=registry.example.com
export IMAGE_TAG=<previous-sha>
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```

View File

@@ -1,151 +0,0 @@
name: deploy
on:
push:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch:
jobs:
test-backend:
runs-on: ubuntu-latest
container:
image: rust:1-slim
services:
postgres:
image: postgres:16-alpine
env:
POSTGRES_USER: mangalord
POSTGRES_PASSWORD: mangalord
POSTGRES_DB: mangalord
options: >-
--health-cmd "pg_isready -U mangalord"
--health-interval 5s
--health-timeout 5s
--health-retries 10
env:
DATABASE_URL: postgres://mangalord:mangalord@postgres:5432/mangalord
steps:
- uses: actions/checkout@v4
- name: Install build deps
run: |
apt-get update
apt-get install -y --no-install-recommends pkg-config libssl-dev ca-certificates
- name: Cache cargo registry and target
uses: actions/cache@v4
with:
path: |
~/.cargo/registry
~/.cargo/git
backend/target
key: cargo-${{ runner.os }}-${{ hashFiles('backend/Cargo.lock') }}
restore-keys: |
cargo-${{ runner.os }}-
- name: cargo test
working-directory: backend
run: cargo test --locked
test-frontend:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: '22'
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: npm ci
working-directory: frontend
run: npm ci
- name: vitest
working-directory: frontend
run: npm test
build-and-push:
runs-on: ubuntu-latest
needs: [test-backend, test-frontend]
# PRs only run the test jobs; build + deploy are reserved for
# post-merge pushes to main. Without this gate every PR would push
# a tagged image to the registry and SSH-deploy to prod.
if: github.event_name != 'pull_request'
outputs:
image_tag: ${{ steps.meta.outputs.image_tag }}
version: ${{ steps.meta.outputs.version }}
steps:
- uses: actions/checkout@v4
- name: Resolve image tags
id: meta
run: |
version="$(grep -m1 '^version' backend/Cargo.toml | cut -d'"' -f2)"
frontend_version="$(grep -m1 '"version"' frontend/package.json | cut -d'"' -f4)"
if [ "$version" != "$frontend_version" ]; then
echo "Version mismatch: backend=$version frontend=$frontend_version" >&2
exit 1
fi
echo "image_tag=${GITHUB_SHA}" >> "$GITHUB_OUTPUT"
echo "version=${version}" >> "$GITHUB_OUTPUT"
- uses: docker/setup-buildx-action@v3
- name: docker login
uses: docker/login-action@v3
with:
registry: ${{ secrets.REGISTRY_URL }}
username: ${{ secrets.REGISTRY_USERNAME }}
password: ${{ secrets.REGISTRY_PASSWORD }}
- name: Build & push backend
uses: docker/build-push-action@v5
with:
context: ./backend
push: true
tags: |
${{ secrets.REGISTRY_URL }}/mangalord-backend:latest
${{ secrets.REGISTRY_URL }}/mangalord-backend:${{ steps.meta.outputs.image_tag }}
${{ secrets.REGISTRY_URL }}/mangalord-backend:${{ steps.meta.outputs.version }}
cache-from: type=gha,scope=backend
cache-to: type=gha,mode=max,scope=backend
- name: Build & push frontend
uses: docker/build-push-action@v5
with:
context: ./frontend
push: true
tags: |
${{ secrets.REGISTRY_URL }}/mangalord-frontend:latest
${{ secrets.REGISTRY_URL }}/mangalord-frontend:${{ steps.meta.outputs.image_tag }}
${{ secrets.REGISTRY_URL }}/mangalord-frontend:${{ steps.meta.outputs.version }}
cache-from: type=gha,scope=frontend
cache-to: type=gha,mode=max,scope=frontend
deploy:
runs-on: ubuntu-latest
needs: build-and-push
if: github.event_name != 'pull_request'
steps:
- name: SSH deploy
uses: appleboy/ssh-action@v1.0.3
with:
host: ${{ secrets.SSH_HOST }}
username: ${{ secrets.SSH_USER }}
key: ${{ secrets.SSH_PRIVATE_KEY }}
port: ${{ secrets.SSH_PORT || 22 }}
envs: REGISTRY_URL,REGISTRY_USERNAME,REGISTRY_PASSWORD,IMAGE_TAG,DEPLOY_PATH
script_stop: true
script: |
set -euo pipefail
cd "$DEPLOY_PATH"
echo "$REGISTRY_PASSWORD" | docker login "$REGISTRY_URL" -u "$REGISTRY_USERNAME" --password-stdin
export REGISTRY_URL IMAGE_TAG
docker compose -f docker-compose.yml -f docker-compose.prod.yml pull
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
docker image prune -f
docker logout "$REGISTRY_URL"
env:
REGISTRY_URL: ${{ secrets.REGISTRY_URL }}
REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
REGISTRY_PASSWORD: ${{ secrets.REGISTRY_PASSWORD }}
IMAGE_TAG: ${{ needs.build-and-push.outputs.image_tag }}
DEPLOY_PATH: ${{ vars.DEPLOY_PATH }}

2
backend/Cargo.lock generated
View File

@@ -1470,7 +1470,7 @@ checksum = "c41e0c4fef86961ac6d6f8a82609f55f31b05e4fce149ac5710e439df7619ba4"
[[package]]
name = "mangalord"
version = "0.34.0"
version = "0.33.0"
dependencies = [
"anyhow",
"argon2",

View File

@@ -1,6 +1,6 @@
[package]
name = "mangalord"
version = "0.34.0"
version = "0.33.0"
edition = "2021"
default-run = "mangalord"

View File

@@ -1,22 +0,0 @@
# Production overlay: layer on top of docker-compose.yml on the deploy
# host so the backend and frontend run from pre-built registry images
# instead of building locally.
#
# docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
#
# REGISTRY_URL and IMAGE_TAG are injected by .gitea/workflows/deploy.yml
# at deploy time. IMAGE_TAG defaults to `latest` so a manual
# `docker compose ... up -d` on the host still works.
services:
backend:
build: !reset null
image: ${REGISTRY_URL}/mangalord-backend:${IMAGE_TAG:-latest}
pull_policy: always
restart: unless-stopped
frontend:
build: !reset null
image: ${REGISTRY_URL}/mangalord-frontend:${IMAGE_TAG:-latest}
pull_policy: always
restart: unless-stopped

View File

@@ -1,15 +1,9 @@
# Production-like compose. Requires a populated `.env` next to this
# file: at minimum POSTGRES_PASSWORD must be set to a non-default
# value (the `?required` form below fails fast otherwise). The
# frontend container expects HTTPS in front (Caddy/Traefik/nginx)
# because COOKIE_SECURE=true browsers will refuse to send the session
# cookie over plain HTTP.
services:
postgres:
image: postgres:16-alpine
environment:
POSTGRES_USER: ${POSTGRES_USER:-mangalord}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?POSTGRES_PASSWORD must be set in .env}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-mangalord}
POSTGRES_DB: ${POSTGRES_DB:-mangalord}
volumes:
- postgres-data:/var/lib/postgresql/data
@@ -25,7 +19,7 @@ services:
postgres:
condition: service_healthy
environment:
DATABASE_URL: postgres://${POSTGRES_USER:-mangalord}:${POSTGRES_PASSWORD:?POSTGRES_PASSWORD must be set in .env}@postgres:5432/${POSTGRES_DB:-mangalord}
DATABASE_URL: postgres://${POSTGRES_USER:-mangalord}:${POSTGRES_PASSWORD:-mangalord}@postgres:5432/${POSTGRES_DB:-mangalord}
BIND_ADDRESS: 0.0.0.0:8080
STORAGE_DIR: /var/lib/mangalord/storage
RUST_LOG: ${RUST_LOG:-info,mangalord=debug}

View File

@@ -1,6 +1,6 @@
{
"name": "mangalord-frontend",
"version": "0.34.0",
"version": "0.33.0",
"private": true,
"type": "module",
"scripts": {