Files
PiCloud/crates/manager-core/migrations/0010_dead_letters.sql
MechaCat02 545d863199 feat(v1.1.1-triggers): triggers + outbox schema + repos
Migrations 0008-0011 lay down the triggers framework's storage:

- `triggers` + `kv_trigger_details` + `dead_letter_trigger_details`
  (Layout E, design notes §2). Parent table carries common columns
  including `registered_by_principal` — the dispatcher uses this to
  run the trigger as the user that registered it (design notes §4).
- `outbox`: universal async dispatch substrate. KV/cron/pubsub/queue/
  email/dead-letter all write rows in the same shape; the dispatcher
  claims due rows via FOR UPDATE SKIP LOCKED. `reply_to` is the
  NATS-style inbox id for sync HTTP (commit 6) — its presence flags
  "don't retry" per the design.
- `dead_letters`: exact schema from design notes §4 with the four-
  value `resolution` CHECK constraint (`replayed | ignored |
  handled_by_script | handler_failed`) and partial index on
  unresolved rows for the dashboard badge.
- `abandoned_executions`: forensic table for the dispatcher's
  "tried to resolve a dropped inbox" edge case (design notes §3 #9).

Repo surfaces with Postgres impls behind traits so unit tests can
swap in-memory backings:
- `TriggerRepo` — CRUD + the `list_matching_kv` /
  `list_matching_dead_letter` hot paths the dispatcher uses.
  Includes a `collection_matches` helper that handles `*`, `prefix:*`,
  and exact-name globs.
- `OutboxRepo` — insert + claim-due + delete + reschedule.
- `DeadLetterRepo` — insert + get + list + unresolved-count +
  resolve + GC.
- `AbandonedRepo` — insert + GC.

`TriggerConfig::from_env` (new module) follows the existing
`SandboxCeiling` env-loading pattern for `PICLOUD_MAX_TRIGGER_DEPTH`,
`PICLOUD_TRIGGER_RETRY_*`, `PICLOUD_DEAD_LETTER_RETENTION_DAYS`, and
`PICLOUD_ABANDONED_EXECUTIONS_RETENTION_DAYS`.

`Capability::AppManageTriggers(AppId)` and `AppDeadLetterManage(AppId)`
join the enum. Both map onto the existing `Scope::AppAdmin` per the
seven-scope commitment; `role_satisfies` grants them at the
`AppAdmin` per-app role.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-01 21:46:45 +02:00

51 lines
2.0 KiB
SQL

-- v1.1.1: dead_letters — design notes §4.
--
-- Async invocations that exhaust their retry policy land here. Each
-- row carries the original event payload verbatim plus the attempt
-- history so handlers (registered via `dead_letter` triggers) and the
-- dashboard can decide what to do.
--
-- Schema mirrors design notes §4. The CHECK constraint on
-- `resolution` enforces the closed vocabulary used by both the SDK
-- (`dead_letters::resolve(id, reason)`) and the recursion-stop rule
-- (`handler_failed`). Sync HTTP failures (`reply_to.is_some()`) never
-- land here — they're served via the inbox channel.
--
-- Indexes:
-- - partial index on unresolved rows: the dashboard's
-- unresolved-count badge query (`COUNT(*) WHERE app_id = $1 AND
-- resolved_at IS NULL`).
-- - GC index on `created_at`: the weekly retention sweep.
CREATE TABLE dead_letters (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
-- The outbox.id row that exhausted retries. The outbox row itself
-- has been deleted at this point.
original_event_id UUID NOT NULL,
source TEXT NOT NULL,
op TEXT NOT NULL,
-- Nullable because direct admin replays may have no trigger row.
trigger_id UUID,
script_id UUID,
payload JSONB NOT NULL,
attempt_count INT NOT NULL,
first_attempt_at TIMESTAMPTZ NOT NULL,
last_attempt_at TIMESTAMPTZ NOT NULL,
last_error TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
resolved_at TIMESTAMPTZ,
resolution TEXT
CHECK (resolution IN
('replayed', 'ignored', 'handled_by_script', 'handler_failed'))
);
-- Dashboard unresolved-count badge — partial index on the predicate
-- the query uses.
CREATE INDEX idx_dead_letters_app_unresolved
ON dead_letters (app_id)
WHERE resolved_at IS NULL;
-- GC sweep scans by creation time.
CREATE INDEX idx_dead_letters_gc ON dead_letters (created_at);