feat(v1.1.1-triggers): triggers + outbox schema + repos
Migrations 0008-0011 lay down the triggers framework's storage: - `triggers` + `kv_trigger_details` + `dead_letter_trigger_details` (Layout E, design notes §2). Parent table carries common columns including `registered_by_principal` — the dispatcher uses this to run the trigger as the user that registered it (design notes §4). - `outbox`: universal async dispatch substrate. KV/cron/pubsub/queue/ email/dead-letter all write rows in the same shape; the dispatcher claims due rows via FOR UPDATE SKIP LOCKED. `reply_to` is the NATS-style inbox id for sync HTTP (commit 6) — its presence flags "don't retry" per the design. - `dead_letters`: exact schema from design notes §4 with the four- value `resolution` CHECK constraint (`replayed | ignored | handled_by_script | handler_failed`) and partial index on unresolved rows for the dashboard badge. - `abandoned_executions`: forensic table for the dispatcher's "tried to resolve a dropped inbox" edge case (design notes §3 #9). Repo surfaces with Postgres impls behind traits so unit tests can swap in-memory backings: - `TriggerRepo` — CRUD + the `list_matching_kv` / `list_matching_dead_letter` hot paths the dispatcher uses. Includes a `collection_matches` helper that handles `*`, `prefix:*`, and exact-name globs. - `OutboxRepo` — insert + claim-due + delete + reschedule. - `DeadLetterRepo` — insert + get + list + unresolved-count + resolve + GC. - `AbandonedRepo` — insert + GC. `TriggerConfig::from_env` (new module) follows the existing `SandboxCeiling` env-loading pattern for `PICLOUD_MAX_TRIGGER_DEPTH`, `PICLOUD_TRIGGER_RETRY_*`, `PICLOUD_DEAD_LETTER_RETENTION_DAYS`, and `PICLOUD_ABANDONED_EXECUTIONS_RETENTION_DAYS`. `Capability::AppManageTriggers(AppId)` and `AppDeadLetterManage(AppId)` join the enum. Both map onto the existing `Scope::AppAdmin` per the seven-scope commitment; `role_satisfies` grants them at the `AppAdmin` per-app role. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
72
crates/manager-core/migrations/0008_triggers.sql
Normal file
72
crates/manager-core/migrations/0008_triggers.sql
Normal file
@@ -0,0 +1,72 @@
|
||||
-- v1.1.1: Trigger framework — Layout E (design notes §2 + §7).
|
||||
--
|
||||
-- A parent `triggers` table holds the common columns (script_id, retry
|
||||
-- config, dispatch_mode, registered-by principal); per-kind detail
|
||||
-- tables hold the kind-specific filter columns. v1.1.1 ships two
|
||||
-- kinds: KV (collection_glob + ops) and dead_letter (source / trigger
|
||||
-- / script filters). Future kinds (cron, pubsub, queue, email) extend
|
||||
-- the parent and add their own detail table.
|
||||
--
|
||||
-- `registered_by_principal` captures the admin user that registered
|
||||
-- the trigger. The dispatcher resolves this back to a `Principal` at
|
||||
-- execution time so the trigger runs as the user that set it up
|
||||
-- (design notes §4: "a trigger execution runs as the principal that
|
||||
-- registered the trigger").
|
||||
--
|
||||
-- HTTP routes stay in their own `routes` table for now (Phase 3
|
||||
-- production schema with its own trie-index columns); the dispatcher
|
||||
-- discriminates HTTP outbox rows by `source_kind = 'http'` and
|
||||
-- `trigger_id` referencing `routes.id`. Folding routes into triggers
|
||||
-- is a v1.2 cleanup, not a v1.1.1 requirement.
|
||||
|
||||
CREATE TABLE triggers (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
|
||||
script_id UUID NOT NULL REFERENCES scripts(id) ON DELETE CASCADE,
|
||||
kind TEXT NOT NULL CHECK (kind IN ('kv', 'dead_letter')),
|
||||
enabled BOOLEAN NOT NULL DEFAULT TRUE,
|
||||
-- Async by default — sync would mean the trigger fires inline with
|
||||
-- the originating mutation, which v1.1.1 doesn't support.
|
||||
dispatch_mode TEXT NOT NULL DEFAULT 'async'
|
||||
CHECK (dispatch_mode IN ('sync', 'async')),
|
||||
-- Defaults applied at write time so the row is auditable on its
|
||||
-- own. Per-trigger overrides set on create; the env-defined
|
||||
-- defaults provide the fallback values.
|
||||
retry_max_attempts INT NOT NULL,
|
||||
retry_backoff TEXT NOT NULL
|
||||
CHECK (retry_backoff IN ('exponential', 'linear', 'constant')),
|
||||
retry_base_ms INT NOT NULL,
|
||||
registered_by_principal UUID NOT NULL REFERENCES admin_users(id) ON DELETE CASCADE,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
-- The dispatcher's hot lookup: "all enabled triggers for app X of
|
||||
-- kind Y". Indexed only when enabled = TRUE so disabled rows don't
|
||||
-- pollute the index.
|
||||
CREATE INDEX idx_triggers_app_kind_enabled
|
||||
ON triggers (app_id, kind)
|
||||
WHERE enabled = TRUE;
|
||||
|
||||
-- One row per KV trigger. `collection_glob` accepts:
|
||||
-- "*" — any collection in the app
|
||||
-- "widgets" — exact match
|
||||
-- "users:*" — prefix wildcard (matched in Rust, not SQL)
|
||||
-- `ops` is the subset of {insert, update, delete} this trigger
|
||||
-- subscribes to. Empty array means "any op" (the trigger fires on
|
||||
-- every mutation; admin endpoint validates this).
|
||||
CREATE TABLE kv_trigger_details (
|
||||
trigger_id UUID PRIMARY KEY REFERENCES triggers(id) ON DELETE CASCADE,
|
||||
collection_glob TEXT NOT NULL,
|
||||
ops TEXT[] NOT NULL
|
||||
);
|
||||
|
||||
-- One row per dead-letter trigger. All three filter columns are
|
||||
-- nullable — NULL means "no filter on this dimension". A trigger
|
||||
-- with all three nullable filters fires on every dead-letter row.
|
||||
CREATE TABLE dead_letter_trigger_details (
|
||||
trigger_id UUID PRIMARY KEY REFERENCES triggers(id) ON DELETE CASCADE,
|
||||
source_filter TEXT,
|
||||
trigger_id_filter UUID,
|
||||
script_id_filter UUID
|
||||
);
|
||||
64
crates/manager-core/migrations/0009_outbox.sql
Normal file
64
crates/manager-core/migrations/0009_outbox.sql
Normal file
@@ -0,0 +1,64 @@
|
||||
-- v1.1.1: Universal trigger outbox — design notes §2.
|
||||
--
|
||||
-- One table for every async dispatch in the system. KV/cron/pubsub/
|
||||
-- queue/email/dead-letter all write rows in this shape; the dispatcher
|
||||
-- claims due rows with `FOR UPDATE SKIP LOCKED` and routes them to
|
||||
-- the executor.
|
||||
--
|
||||
-- Sync HTTP also writes here (NATS-style inbox, design notes §3) —
|
||||
-- `reply_to` carries an `inbox_id` that the orchestrator awaits on a
|
||||
-- oneshot channel. `reply_to.is_some()` is the "don't retry" signal:
|
||||
-- one attempt, surface the result via the inbox.
|
||||
--
|
||||
-- `trigger_id` is a polymorphic reference discriminated by
|
||||
-- `source_kind`: for `source_kind='http'` it references `routes.id`;
|
||||
-- otherwise it references `triggers.id`. Polymorphism handled in
|
||||
-- Rust (the dispatcher); no DB-level FK because Postgres doesn't
|
||||
-- support polymorphic FKs cleanly. NULL is allowed because direct
|
||||
-- admin-replay paths may not have a triggering row at all.
|
||||
--
|
||||
-- `script_id` denormalized so the dispatcher resolves the target
|
||||
-- script without an extra round-trip per row.
|
||||
|
||||
CREATE TABLE outbox (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
|
||||
source_kind TEXT NOT NULL
|
||||
CHECK (source_kind IN ('http', 'kv', 'dead_letter')),
|
||||
-- Polymorphic — see comment above. No FK constraint.
|
||||
trigger_id UUID,
|
||||
-- Pre-resolved at write time so the dispatcher doesn't re-look it up.
|
||||
script_id UUID,
|
||||
-- NULL = async (retry per policy). Some(inbox_id) = sync HTTP
|
||||
-- (never retry; resolve the inbox with the result).
|
||||
reply_to UUID,
|
||||
-- ServiceEvent + ExecRequest scaffold serialized as JSONB.
|
||||
payload JSONB NOT NULL,
|
||||
-- Forensic field — the principal that triggered the originating
|
||||
-- event. NOT the execution principal for trigger fan-out (that
|
||||
-- comes from `triggers.registered_by_principal`).
|
||||
origin_principal UUID,
|
||||
-- Trigger-depth as the dispatcher will hand it to the executor.
|
||||
-- Read out into ExecRequest.trigger_depth at dispatch time.
|
||||
trigger_depth INT NOT NULL DEFAULT 0,
|
||||
-- Originating execution id (for audit log grouping). Equals the
|
||||
-- root for direct invocations; preserved across fan-out chains.
|
||||
root_execution_id UUID,
|
||||
attempt_count INT NOT NULL DEFAULT 0,
|
||||
next_attempt_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
-- Set inside the SELECT FOR UPDATE SKIP LOCKED transaction so
|
||||
-- the dispatcher can't double-pick a row across concurrent loop
|
||||
-- iterations.
|
||||
claimed_at TIMESTAMPTZ,
|
||||
claimed_by TEXT,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
-- Hot index: the dispatcher's `WHERE next_attempt_at <= NOW() AND
|
||||
-- claimed_at IS NULL` claim query. Partial index keeps the hot set
|
||||
-- small even if the table grows large.
|
||||
CREATE INDEX idx_outbox_due
|
||||
ON outbox (next_attempt_at)
|
||||
WHERE claimed_at IS NULL;
|
||||
|
||||
CREATE INDEX idx_outbox_app ON outbox (app_id);
|
||||
50
crates/manager-core/migrations/0010_dead_letters.sql
Normal file
50
crates/manager-core/migrations/0010_dead_letters.sql
Normal file
@@ -0,0 +1,50 @@
|
||||
-- v1.1.1: dead_letters — design notes §4.
|
||||
--
|
||||
-- Async invocations that exhaust their retry policy land here. Each
|
||||
-- row carries the original event payload verbatim plus the attempt
|
||||
-- history so handlers (registered via `dead_letter` triggers) and the
|
||||
-- dashboard can decide what to do.
|
||||
--
|
||||
-- Schema mirrors design notes §4. The CHECK constraint on
|
||||
-- `resolution` enforces the closed vocabulary used by both the SDK
|
||||
-- (`dead_letters::resolve(id, reason)`) and the recursion-stop rule
|
||||
-- (`handler_failed`). Sync HTTP failures (`reply_to.is_some()`) never
|
||||
-- land here — they're served via the inbox channel.
|
||||
--
|
||||
-- Indexes:
|
||||
-- - partial index on unresolved rows: the dashboard's
|
||||
-- unresolved-count badge query (`COUNT(*) WHERE app_id = $1 AND
|
||||
-- resolved_at IS NULL`).
|
||||
-- - GC index on `created_at`: the weekly retention sweep.
|
||||
|
||||
CREATE TABLE dead_letters (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
|
||||
-- The outbox.id row that exhausted retries. The outbox row itself
|
||||
-- has been deleted at this point.
|
||||
original_event_id UUID NOT NULL,
|
||||
source TEXT NOT NULL,
|
||||
op TEXT NOT NULL,
|
||||
-- Nullable because direct admin replays may have no trigger row.
|
||||
trigger_id UUID,
|
||||
script_id UUID,
|
||||
payload JSONB NOT NULL,
|
||||
attempt_count INT NOT NULL,
|
||||
first_attempt_at TIMESTAMPTZ NOT NULL,
|
||||
last_attempt_at TIMESTAMPTZ NOT NULL,
|
||||
last_error TEXT NOT NULL,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
||||
resolved_at TIMESTAMPTZ,
|
||||
resolution TEXT
|
||||
CHECK (resolution IN
|
||||
('replayed', 'ignored', 'handled_by_script', 'handler_failed'))
|
||||
);
|
||||
|
||||
-- Dashboard unresolved-count badge — partial index on the predicate
|
||||
-- the query uses.
|
||||
CREATE INDEX idx_dead_letters_app_unresolved
|
||||
ON dead_letters (app_id)
|
||||
WHERE resolved_at IS NULL;
|
||||
|
||||
-- GC sweep scans by creation time.
|
||||
CREATE INDEX idx_dead_letters_gc ON dead_letters (created_at);
|
||||
31
crates/manager-core/migrations/0011_abandoned_executions.sql
Normal file
31
crates/manager-core/migrations/0011_abandoned_executions.sql
Normal file
@@ -0,0 +1,31 @@
|
||||
-- v1.1.1: abandoned_executions — design notes §3 #9.
|
||||
--
|
||||
-- Forensic table for the "dispatcher tried to resolve a oneshot inbox
|
||||
-- but the receiver was already dropped" edge case. The orchestrator
|
||||
-- timed out (returned 504 to the caller) and gave up on the channel,
|
||||
-- but then the dispatcher's execution succeeded later. The caller
|
||||
-- never sees the result; the row exists so the operator can
|
||||
-- correlate when the abandoned-counter metric spikes.
|
||||
--
|
||||
-- Only the dispatcher-after-orchestrator-timeout edge case writes
|
||||
-- here; ordinary "script timed out, caller got 504" stays uneventful.
|
||||
--
|
||||
-- 7-day retention, GC by `created_at`, sweep alongside dead_letters.
|
||||
|
||||
CREATE TABLE abandoned_executions (
|
||||
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
||||
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
|
||||
-- Original outbox row id (the row itself has been deleted).
|
||||
outbox_id UUID NOT NULL,
|
||||
script_id UUID,
|
||||
-- The inbox channel id the dispatcher tried to resolve.
|
||||
inbox_id UUID NOT NULL,
|
||||
-- The HTTP status code the dispatcher attempted to send back.
|
||||
status_code INT NOT NULL,
|
||||
-- Truncated body / error description (capped at write time —
|
||||
-- the dispatcher doesn't need to ship megabytes here).
|
||||
result_summary TEXT,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
||||
);
|
||||
|
||||
CREATE INDEX idx_abandoned_executions_gc ON abandoned_executions (created_at);
|
||||
Reference in New Issue
Block a user