diff --git a/HANDBACK.md b/HANDBACK.md index f5ad449..b66d269 100644 --- a/HANDBACK.md +++ b/HANDBACK.md @@ -1,340 +1,242 @@ -# v1.1.1 Implementation HANDBACK +# v1.1.2 Implementation HANDBACK ## 1. Branch + commit count -- Branch: `feat/v1.1.1-storage-and-events` +- Branch: `feat/v1.1.2-documents` - Base: `main` -- 11 commits ahead of `main`. Branch is **not pushed**, **not merged**. +- 7 commits ahead of `main`. Branch is **not pushed**, **not merged**. ``` -66b661f chore(release): bump workspace to v1.1.1 + CHANGELOG -6b7ff78 feat(v1.1.1-gc): dead-letter + abandoned-executions retention sweepers -1795dfc feat(v1.1.1-dead-letters): dashboard badge + list view -20f1b5e feat(v1.1.1-dead-letters): service + Rhai SDK + admin endpoints -77b2cb5 feat(v1.1.1-routes): outbox-routed sync HTTP + dispatch_mode=async -6a2971a feat(v1.1.1-dispatcher): dispatcher loop + retry + depth limit + outbox emitter -2e92691 feat(v1.1.1-triggers): trigger CRUD admin endpoints -545d863 feat(v1.1.1-triggers): triggers + outbox schema + repos -6b99f74 feat(v1.1.1-kv): Rhai kv:: SDK module + ctx.event wiring -434fb63 feat(v1.1.1-kv): migrations + KvService trait + Postgres impl -1efb350 docs(v1.1.x): resolve in-flight decisions as Decided 2026-06-01 +docs(v1.1.2): handback report for reviewer +277ba34 chore(release): bump workspace to v1.1.2 + CHANGELOG +2a047f1 feat(v1.1.2-docs): wire DocsServiceImpl into picloud binary +a66d4af feat(v1.1.2-docs): Rhai docs:: SDK module + ctx.event.docs + bridge tests +ef59309 feat(v1.1.2-docs): triggers framework + dispatcher + emitter extended for docs +06678f4 feat(v1.1.2-docs): manager-core docs service + repo + query DSL parser +3af8cc3 feat(v1.1.2-docs): migrations + shared DocsService trait + TriggerEvent::Docs ``` -The first commit (`1efb350`) absorbed working-tree edits to -`docs/v1.1.x-design-notes.md` that turned the "in-flight" 20 open -calls into "Decided 2026-06-01" entries. Those were on the working -tree at branch creation; folding them into the v1.1.1 branch keeps -the design rationale colocated with the implementation. - ## 2. Scope coverage (Done / Partial / Skipped) -| Scope item | Status | Notes | +| Scope item (from brief) | Status | Notes | |---|---|---| -| **1. KV store** | Done | Migration 0007, `KvService` trait in shared, `KvServiceImpl` + `PostgresKvRepo` in manager-core, Rhai `kv::collection(name).{get,set,has,delete,list}` bridge, cursor pagination, empty-collection rejection, script-as-gate authz. | -| **2. Triggers framework — Layout E** | Done | Migrations 0008 (`triggers` + `kv_trigger_details` + `dead_letter_trigger_details`), `TriggerRepo` + `PostgresTriggerRepo`, CRUD admin endpoints. `registered_by_principal` column captured + threaded into the dispatcher. Depth-limit enforced in the dispatcher (default 8). | -| **3. Universal outbox + dispatcher** | Done | Migration 0009 (`outbox`), `OutboxRepo` + `PostgresOutboxRepo`, `Dispatcher` tokio task. Polls every 100ms, claims 8 rows/tick via `FOR UPDATE SKIP LOCKED`, gate-bounds dispatch, retries with backoff+jitter, dead-letters on exhaustion, late-completion → `abandoned_executions`. | -| **4. NATS-style sync HTTP** | Done | `InboxRegistry` in orchestrator-core (in-process `Mutex>`), `InboxResolver` trait in shared. Orchestrator sync-route path registers receiver, writes outbox row with `reply_to`, awaits with timeout = script.timeout + 2s buffer. Status mapping per design notes §3 (422/502/503/504/507/500). | -| **5. `dispatch_mode: async` HTTP routes** | Done | Migration 0012 adds the column (default `sync`). `DispatchMode` enum in shared. Route admin payload + RouteRepository serialize it. Compiled routes carry it; the matcher returns it in `Matched`. Orchestrator branches: async → outbox + 202; sync → outbox + inbox. | -| **6. Dead letters** | Done | Migration 0010 (`dead_letters`), `DeadLetterRepo` + `DeadLetterService` + `PostgresDeadLetterService`. Rhai `dead_letters::{replay,resolve}` + admin endpoints (`GET /count`, `GET /`, `GET /{id}`, `POST /{id}/replay`, `POST /{id}/resolve`). `Capability::AppDeadLetterManage(AppId)` enforced. List intentionally NOT shipped (deferred to v1.2). Recursion-stop rule (handler-failure annotates original DL as `handler_failed`) implemented in the dispatcher. | -| **7. Abandoned executions** | Done | Migration 0011, `AbandonedRepo` + `PostgresAbandonedRepo`, dispatcher writes a row on dropped-receiver inbox delivery. Metric path reserved (`TODO(metrics)` markers in dispatcher.rs). | -| **8. Retry policy defaults** | Done | `TriggerConfig::from_env` (new module). Env vars: `PICLOUD_MAX_TRIGGER_DEPTH`, `PICLOUD_TRIGGER_RETRY_{MAX_ATTEMPTS,BACKOFF,BASE_MS,JITTER_PCT}`, `PICLOUD_DEAD_LETTER_RETENTION_DAYS`, `PICLOUD_ABANDONED_EXECUTIONS_RETENTION_DAYS`. Per-trigger overrides applied at trigger-creation time. | -| **9. `ctx.event` for triggered scripts** | Done | `TriggerEvent` enum in shared (KV / DeadLetter variants), `SdkCallCx.event: Option` + `is_dead_letter_handler: bool`. `engine.rs::build_ctx_map` flattens the event into `ctx.event` for triggered handlers; direct invocations leave the key absent. Shape matches design notes §4 (KV with op + collection + key + value; dead_letter with original + attempts + last_error + ids + timestamps). | -| **10. Dashboard surface** | Done | Per-app red badge with unresolved count on apps list + per-app detail page. New `apps/[slug]/dead-letters/+page.svelte` list view with all design-notes-mandated columns + Replay + Mark resolved actions + expandable row detail. svelte-check passes (369 files, 0 errors, 0 warnings). | -| **11. Workspace version bump** | Done | Workspace `1.1.0` → `1.1.1`, SDK `1.1` → `1.2`, dashboard `0.6.0` → `0.7.0`. CHANGELOG.md created at repo root. | +| `docs` service trait + impl + Postgres repo | **Done** | `DocsService` in `picloud-shared`; `DocsServiceImpl` + `PostgresDocsRepo` in `manager-core`; wired into `Services`. | +| Rhai SDK surface (`docs::collection(name).{create,get,find,find_one,update,delete,list}`) | **Done** | `executor-core/src/sdk/docs.rs`. Handle pattern via `engine.register_type_with_name::` + `register_fn` per method. | +| Query DSL v1.1.2 subset (`$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, `$in`, dot paths to 5 levels, `$sort`, `$limit`) | **Done** | `manager-core/src/docs_filter.rs` parser + AST; SQL emitted by `manager-core/src/docs_repo.rs::build_find_query`. Unsupported operators throw with v1.2 pointer. | +| `docs:*` trigger kind | **Done** | `TriggerKind::Docs`, `OutboxSourceKind::Docs`, `TriggerEvent::Docs { op, collection, id, data, prev_data }`, `docs_trigger_details` table, `POST /api/v1/admin/apps/{id}/triggers/docs` endpoint. | +| Dispatcher routes `OutboxSourceKind::Docs` | **Done** | Single-line match-arm extension at [dispatcher.rs:166](crates/manager-core/src/dispatcher.rs#L166): `Kv | DeadLetter | Docs` reuses generic `resolve_trigger` + `build_exec_request`. | +| Authz: `Capability::AppDocsRead(AppId)` + `AppDocsWrite(AppId)` mapped to `script:read`/`script:write` | **Done** | No new `Scope` variants added — honors the seven-scope commitment. Read at Viewer, write at Editor (mirrors KV). | +| Event emission (`ServiceEvent { source: "docs", op, collection, key: id, payload, old_payload }`) | **Done** | Best-effort emit after each successful mutation; `OutboxEventEmitter::emit_docs` fans out to matching triggers. | +| `ctx.event.docs.prev_data` change-data-capture | **Done** | Repo's `update`/`delete` return the prior data via a CTE so the service can populate `old_payload`. `trigger_event_to_dynamic` in `engine.rs` builds the Rhai-visible map. | +| Migrations 0013 + 0014 | **Done** | 0013 = docs table + GIN-on-`jsonb_path_ops`. 0014 = CHECK extensions + `docs_trigger_details`. | +| Version bumps + CHANGELOG | **Done** | Workspace `1.1.1 → 1.1.2`, SDK `1.2 → 1.3`, dashboard `0.7.0 → 0.8.0`, CHANGELOG entry with downgrade caveats + known limitations. | +| Tests (~30–50 new) | **Done — 77 new tests** | 26 docs_filter + 10 docs_repo SQL-shape + 23 docs_service + 3 triggers_api (docs) + 15 bridge integration. | +| Optional: prune `docs/v1.1.x-design-notes.md` §1–4 | **Skipped** | Left for a separate cleanup PR. §1–4 contain the rationale for v1.1.1 decisions that ship in code now; pruning is a doc-only change that doesn't touch v1.1.2's scope. | -## 3. Key implementation decisions / deviations +## 3. Query DSL implementation notes -### Outbox column set (deferred to implementation per design notes §2) -Chose: -- `script_id` denormalized — dispatcher resolves the target without - re-joining for the common path. -- `trigger_id` polymorphic (no DB FK) — references `triggers.id` for - `source_kind IN {kv, dead_letter}`, `routes.id` for - `source_kind = 'http'`. Discrimination in Rust at dispatch time. -- `claimed_by TEXT` — pid-based for MVP; cluster mode can use any - identifier without schema change. -- `trigger_depth` + `root_execution_id` denormalized so the - dispatcher rebuilds `ExecRequest` without joining back to the - originating execution log. -- No explicit `is_dead_letter_handler` column — dispatcher infers - from the trigger's `kind` field at dispatch time. +### Operator dispatch path -### KV pagination -- **Cursor-style**, opaque base64-encoded last-key. -- Page-size cap of 1000 with default 100 (enforced in repo). -- Documented in `crates/shared/src/kv.rs` and the SDK function - comment. +A script's filter is a Rhai `Map`. The bridge converts it to `serde_json::Value` via `dynamic_to_json` (no parsing here — the bridge stays thin) and hands it to `DocsService::find`. The service calls `docs_filter::parse_filter` which: -### KV TTL -- Blueprint §8.1 reserved an `expires_at` column. v1.1.1 design notes - don't surface TTL through the SDK (`set(k,v)` has no TTL argument) - so the column is **omitted from migration 0007**. Adding it later - is a non-breaking forward migration. Recorded in CHANGELOG as a - deferred item. +1. Validates the filter is a JSON object. +2. Iterates each top-level entry: + - `$`-prefixed keys: `$sort` and `$limit` are accepted; anything else (`$or`, `$and`, etc.) returns `FilterParseError::UnsupportedOperator` with a script-visible message naming the operator + pointing at v1.2. + - Other keys: parsed as a `FieldPath` (validates non-empty, no `..`, no `$`-prefixed segments, depth ≤ 5). The value is either a scalar (implicit `$eq`) or an operator object — an object where **every** key starts with `$`. Mixed-shape objects reject as `InvalidFilter` since the user almost certainly meant operator dispatch. +3. Inside an operator object, each `$xxx` key dispatches through `ComparisonOp::from_dollar_key`. Unknown operators return `UnsupportedOperator`. -### Authz scope mapping (seven-scope commitment) -The four new capabilities map onto existing scopes — **no new scope -variants** to honour the `Scope` enum's "exactly seven values" -contract (`crates/shared/src/auth.rs:103`): +The resulting `DocsFilter { conditions, sort, limit }` is purely descriptive — no SQL or Postgres concepts leak in. -| Capability | Scope | -|---|---| -| `AppKvRead` | `script:read` | -| `AppKvWrite` | `script:write` | -| `AppManageTriggers` | `app:admin` | -| `AppDeadLetterManage` | `app:admin` | +### Dot-path → JSONB navigation -`role_satisfies` grants `AppKvRead` at the Viewer role, `AppKvWrite` -at Editor, and both trigger / DL caps at AppAdmin. +`FieldPath::parse` splits on `.` and validates each segment. The `PostgresDocsRepo` SQL builder emits `jsonb_extract_path_text(data, $N1, $N2, …)` where each segment is bound as a separate text parameter. Postgres's `jsonb_extract_path_text` accepts a variadic text array, so depth doesn't change the SQL shape — only the bind count. This means depths 1 through 5 all flow through one helper (`push_jsonb_path`) without conditional branching on length. -### Script-as-gate authz for SDK calls -- `KvServiceImpl` runs `authz::require` only when - `cx.principal.is_some()`. Anonymous public-HTTP scripts (the - common case for public routes) bypass the cap check. -- Cross-app isolation is **independent** of this — enforced by - `cx.app_id` being the only source of `app_id` on every query. -- `PostgresDeadLetterService::{replay,resolve}` keeps a hard - `require` (no `if let Some`) — managing dead letters is an admin - act per design notes §4. Public scripts with `principal: None` - fail the check, which is correct. +### Parser error → Rhai error pipeline -### Trait split: `OutboxRepo` vs `OutboxWriter` -orchestrator-core can't depend on manager-core (would invert the -dependency arrow). Defined a small `OutboxWriter` trait in -`picloud-shared` with a single `enqueue_http` method. -`PostgresOutboxRepo` implements both `OutboxRepo` (dispatcher -surface) and `OutboxWriter` (orchestrator surface); the picloud -binary clones one concrete Arc into both trait views — mirrors the -existing `members_concrete` / `AuthzRepo` pattern. +``` +docs_filter::parse_filter + └─ FilterParseError::{InvalidFilter, UnsupportedOperator}(String) + └─ DocsServiceImpl::find via `From for DocsError` + └─ DocsError::{InvalidFilter, UnsupportedOperator}(String) + └─ executor-core::sdk::docs::block_on + └─ EvalAltResult::ErrorRuntime("docs: ") +``` -### `InboxResolver` lives in shared, `InboxRegistry` in orchestrator-core -Same split rationale — the dispatcher (manager-core) only depends on -the trait, while the in-process impl lives next to its consumer. -Cluster mode (v1.3+) swaps the impl for `LISTEN/NOTIFY` behind the -unchanged trait. +The error string flows verbatim from the parser. The Rhai bridge prefixes `"docs: "` and surfaces it through `Box`. Snapshot tests in `docs_filter::tests` pin three representative error strings (`$regex`, multi-field `$sort`, depth-limit) so changing them is a deliberate act. -### manager-core now depends on executor-core -Previously manager-core only depended on orchestrator-core. The -dispatcher needs `ExecRequest`/`ExecResponse`/`ExecError`/ -`InvocationType` from `executor-core` to build invocation -descriptors. This is the transport DTO interpretation of the -working-rules "don't reach across `*-core` crates" — DTOs are fine, -behaviour is the bright line. +### SQL builder — parameterised vs hardcoded -### Sync HTTP via outbox is the default for the user-routes path -The orchestrator's user-route handler is fully on the NATS-style -path now — every sync HTTP request writes to the outbox and awaits -inbox delivery. Adds ~2-5ms per request per design notes §3 latency -budget. `/api/v1/execute/{id}` (the admin/dev bypass) still calls -the executor directly since it doesn't need the unified -observability — kept for simplicity and admin tooling speed. +This is the load-bearing security surface. The reviewer should audit `crates/manager-core/src/docs_repo.rs::build_find_query` and the `emit_condition` / `push_jsonb_path` helpers. -### Trigger-depth check is on the outbox row, not in the executor -Dispatcher rejects depth-exceeded rows **before** trying to -execute. The `cx.trigger_depth` field is informational on the -executor side. Rejection writes a log + (reserved) metric and -deletes the row — no DL, per design notes §4. +**Hardcoded SQL fragments** (never come from user input): +- The base `SELECT id, data, created_at, updated_at FROM docs WHERE app_id = ` prefix. +- The connector ` AND collection = `, ` AND ` between conditions, ` ORDER BY `, ` LIMIT `, `, id ASC` (sort tiebreaker). +- The comparison operator tokens: `=`, `IS DISTINCT FROM`, `IS NULL`, `IS NOT NULL`, `>`, `>=`, `<`, `<=`, `= ANY(`. +- The sort direction tokens: ` ASC`, ` DESC`. +- The `jsonb_extract_path_text(data` opening + closing `)`. -## 4. Tests added +**Parameter-bound (every byte of user input)**: +- `app_id` (the cross-app isolation gate, always `$1`). +- `collection` (always `$2`). +- Every field-path segment (one `$N` per segment). +- Every comparison value (one `$N` per condition). +- The `$in` value list as a single `$N` bound as `TEXT[]`. +- The `$limit` integer as `$N` bound as `BIGINT`. -### Unit tests (no DB required) -- `manager-core::kv_service::tests` (10 tests) — round-trip, missing - key returns None, `has` predicate, `delete` was-present, - empty-collection rejection, **cross-app isolation**, anonymous-cx - skips authz, authed-cx-with-no-role is Forbidden, owner-can-write, - cursor pagination via in-memory KvRepo + denying authz repo. -- `manager-core::trigger_config::tests` (2 tests) — conservative - defaults, backoff round-trips. -- `manager-core::trigger_repo::tests` (1 test) — `collection_matches` - glob behaviour (`*`, `prefix:*`, exact). -- `manager-core::dispatcher::tests` (5 tests) — exponential / linear / - constant backoff math, jitter within bounds, ExecError → - InboxFailureKind classification, failure-kind → status-code mapping. -- `manager-core::abandoned_repo::tests` (2 tests) — truncate - char-boundary safety. -- `manager-core::triggers_api::tests` (5 tests) — unknown-app 404, - member-without-role 403, default fallback for retry settings, - empty-glob rejection, cross-app delete is treated as not-found. -- `orchestrator-core::inbox::tests` (4 tests) — register/deliver - round-trip, unknown-id is Abandoned, dropped receiver is - Abandoned, explicit cancel removes sender. -- `executor-core::engine::tests` (3 new) — `ctx.event` absent for - direct invocations, KV insert shape matches design notes, - KV delete has unit value. -- `executor-core::sdk_kv` integration suite (7 tests) — runs a real - Rhai engine under `spawn_blocking` against an in-memory - `KvService` impl. Covers handle pattern, round-trip, unit-on- - missing, has predicate, delete-was-present, empty-collection - throws, cursor pagination, **cross-app isolation through the - bridge**. +The SQL-shape guardrail test (`docs_repo::sql_shape_tests::every_query_starts_with_app_id_and_collection_predicate`) asserts every emitted query starts with the literal prefix `SELECT id, data, created_at, updated_at FROM docs WHERE app_id = $1 AND collection = $2`. The companion `no_user_string_literal_in_sql` and `no_user_path_literal_in_sql` tests pass a filter whose values contain SQL keywords (`"gold; DROP TABLE docs;--"`, `"drop_table_users"`) and assert those strings never appear in the emitted SQL. -**Total: 47 new tests across the workspace.** Workspace test counts -after v1.1.1: 63 manager-core / 56 orchestrator-core / 17 -executor-core engine / 7 sdk_kv / 30 sdk_contract / 43 stdlib / -21 picloud / 6 shared. +### Semantic corner cases -### Intentionally untested -- DB-backed integration tests for the full dispatcher loop, KV→ - trigger→DL retry chain, sync HTTP via outbox round-trip, - recursion-stop end-to-end. These need a real Postgres harness; - the reviewer runs them via the manual smoke flow below. -- Postgres-specific repo behaviour (sqlx query correctness). The - repos compile and run against the schema, but no integration - test crate spins up a DB in this branch — same pattern as v1.1.0 - (see existing `ignored, needs DATABASE_URL` test markers). +- **`$ne` uses `IS DISTINCT FROM`** (not `<>`). `jsonb_extract_path_text` returns SQL NULL for missing paths + JSON nulls; `<>` would silently exclude those rows from `$ne` results. Tested in `docs_repo::sql_shape_tests::ne_with_value_uses_is_distinct_from`. +- **`$eq null`** emits `IS NULL`; **`$ne null`** emits `IS NOT NULL`. Avoids any `= NULL` / `<> NULL` shenanigans. +- **Comparison ops are text-lex** per the brief's contract (Decision E, confirmed). Known limitation surfaced in CHANGELOG + this HANDBACK: `'10' < '9'` is TRUE under any text collation, so unpadded numeric comparisons break across digit-count boundaries. Workaround for users: zero-pad numeric strings. v1.2's advanced-query expansion will add numeric-aware operators. -## 5. Open questions for the reviewer +## 4. Schema decisions (beyond the brief) -1. **Outbox `claimed_at` clearing on success.** The dispatcher - `delete`s the outbox row after success / DL. For failures it - reschedules (which sets `claimed_at = NULL`). Both flows are - correct, but if you imagine a crash between the executor return - and the row update, the row stays claimed forever. Cluster mode - should add a periodic "unstick stale claims" sweep. Not in - v1.1.1 scope but worth surfacing. -2. **Sync HTTP overhead.** Every sync HTTP request now goes through - the outbox (write + dispatcher pickup + inbox delivery). - Measured overhead expected ~2-5ms per design notes §3. No - benchmarking yet — recommend the reviewer pick a representative - script and compare 95p latency vs v1.1.0 if performance matters. -3. **HTTP outbox rows don't run as a principal.** The orchestrator's - public HTTP path has no authenticated user; the - `origin_principal` field on the outbox row is forensic. The - resulting `ExecRequest.principal = None`, so the script runs - anonymously — matches direct execution. If you'd prefer - triggered-from-HTTP scripts to inherit a derived principal - (e.g. the route's app's owner), that's an additive change. -4. **Dispatcher uses `ASYNC_EXEC_TIMEOUT = 300s` for async rows.** - Async dispatches don't have a script-level timeout (no - originating HTTP request to bound). Picked the same platform - cap as `LocalExecutorClient`. If async needs a different cap, - easy to thread through `TriggerConfig`. -5. **Dispatcher tick cadence is 100ms.** Bounded enough that - fan-out feels instant; loose enough that an idle process - doesn't burn cycles. If the reviewer wants tighter latency, - bump to 50ms or use `LISTEN/NOTIFY` for wake-up (v1.3+ work). -6. **CHANGELOG.md is new.** Followed the rest of the repo's - convention from git log (release commits + design-notes - references). If a different format is preferred, easy to swap. +The brief sketched the docs table; I refined it as follows: -## 6. Deferred to later releases +- **GIN index uses `jsonb_path_ops`** (smaller index, supports `@>` containment for equality filter shapes). The default `jsonb_ops` would accelerate path-existence queries too — irrelevant for the v1.1.2 operator set. +- **Migration sequencing**: two migrations (0013_docs.sql + 0014_docs_triggers.sql) instead of one. Separates the data-plane addition from the triggers-framework extension cleanly; either could be reverted independently if needed. +- **CHECK constraint names**: relied on Postgres's auto-name convention for inline column-CHECKs (`__check`). Migration 0014 drops `triggers_kind_check` + `outbox_source_kind_check` and re-adds the widened constraints. **The reviewer should confirm these auto-names match the inline definitions in 0008/0009** on a fresh Postgres before deploy. +- **`docs_trigger_details.ops` is `TEXT[] NOT NULL`** without a `DEFAULT '{}'` — matches `kv_trigger_details.ops`. Callers always supply a (possibly empty) array. +- **No `dispatch_mode` column on `docs_trigger_details`** — the parent `triggers.dispatch_mode` is sufficient. KV does the same. -- `dead_letters::list(filter)` Rhai SDK — design notes §4 defers - to v1.2 to align with `docs::find()` query DSL. -- KV TTL (`set(k, v, ttl_secs)`) — blueprint reserved it; v1.1.1 - SDK doesn't surface it. Forward-compat (no schema cost). -- Auto-disable of triggers whose script was deleted — design notes - §4 says current handling is metric+log; auto-disable is v1.2. -- Per-app dead-letter retention — design notes §4 says env-only in - v1.1.1. -- Metrics counter emit for `picloud_trigger_depth_exceeded`, - `picloud_dead_letter_handler_failures`, - `picloud_abandoned_executions_total`. Code paths log the - occurrences with `tracing::warn`/`error`; the actual - counter-emit code is a `TODO(metrics)` comment in the - dispatcher. Metrics surface is v1.1.7+ per the roadmap. -- DB-backed integration tests for the dispatcher loop (see §4 - intentionally-untested). -- Sync HTTP performance benchmarks comparing v1.1.0 direct path vs - v1.1.1 outbox path. +## 5. Tests added (one line each) -## 7. How to verify locally +### `crates/shared/src/docs.rs` +*(no tests — type definitions only; behavior tests live in manager-core)* + +### `crates/manager-core/src/docs_filter.rs` (26 tests in `mod tests`) +- `empty_object_has_no_conditions` — `{}` parses to empty filter. +- `single_equality_top_level` — `{ tier: "gold" }` → one Eq condition. +- `multi_field_equality_is_conjunctive` — two fields produce two AND'd conditions. +- `nested_dotted_path` — `"user.email"` parses to two segments. +- `depth_limit_rejects_six_segments` — 6-segment path errors. +- `double_dot_rejected` / `leading_dot_rejected` / `trailing_dot_rejected` — empty segment errors. +- `dollar_prefix_in_path_segment_rejected` — segment can't start with `$`. +- `each_supported_operator_parses` — parametric over all 7 v1.1.2 operators. +- `dollar_in_with_non_array_value_rejected` — `$in: "scalar"` errors. +- `scalar_op_with_object_value_rejected` — `$gt: { ... }` errors. +- `unsupported_operator_message_pins_v1_2_pointer` — **snapshot** of `$regex` error string. +- `unsupported_top_level_modifier_rejected` — `$or` errors with v1.2 pointer. +- `depth_limit_message_pinned` — **snapshot** of depth-limit error string. +- `mixed_shape_operator_object_rejected` — `{ $gt: 1, other: 2 }` errors. +- `sort_asc_and_desc_parse` — `$sort: { x: 1 }` and `{ x: -1 }`. +- `sort_with_bad_direction_rejected` — direction must be 1 or -1. +- `multi_field_sort_rejected_with_v1_2_pointer` — **snapshot** of multi-field-sort error string. +- `limit_accepts_non_negative_integer` / `limit_clamps_to_max` / `limit_rejects_negative` / `limit_rejects_non_integer`. +- `non_object_filter_rejected`. +- `dollar_eq_value_can_be_null` — JSON null is a valid scalar for `$ne`. +- `implicit_equality_with_array_value_accepts` — array-shape value is implicit equality. + +### `crates/manager-core/src/docs_repo.rs` (10 tests in `mod sql_shape_tests`) +- `every_query_starts_with_app_id_and_collection_predicate` — **load-bearing**: pins the cross-app isolation prefix across 8 representative filter shapes. +- `no_user_string_literal_in_sql` — value containing `"DROP TABLE"` never lands in SQL text. +- `no_user_path_literal_in_sql` — path `"drop_table_users"` never lands in SQL text. +- `empty_filter_sql_has_no_extra_conditions` — `{}` produces bare base WHERE. +- `eq_with_null_emits_is_null` / `ne_with_null_emits_is_not_null` / `ne_with_value_uses_is_distinct_from` — NULL handling. +- `in_emits_any_array` — `$in` uses `= ANY(...)`. +- `sort_appends_tiebreaker_id_asc` — sort always has `, id ASC` tail. +- `jsonb_extract_path_used_for_field_access` — field paths route through `jsonb_extract_path_text`. + +### `crates/manager-core/src/docs_service.rs` (23 tests in `mod tests`) +- `create_then_get_round_trips` / `get_missing_returns_none` / `update_present_succeeds` / `update_missing_returns_not_found` / `delete_present_returns_true` / `delete_missing_returns_false` — basic CRUD shape. +- `empty_collection_rejected` — `""` collection. +- `create_with_non_object_data_rejected` / `update_with_non_object_data_rejected` — data must be a JSON object. +- `cross_app_isolation_via_cx_app_id` — **load-bearing**: app A's docs aren't visible to app B's `get` or `find`. +- `anonymous_cx_skips_authz` — script-as-gate semantics. +- `authed_cx_with_no_role_is_forbidden_on_read` / `…_on_write`. +- `owner_principal_can_write` / `editor_member_can_write_via_role`. +- `find_with_equality_returns_matches` / `find_with_dollar_in_returns_subset`. +- `find_one_returns_first_or_none` / `find_one_explicit_limit_is_honoured`. +- `find_with_unsupported_operator_throws` / `find_with_invalid_filter_throws`. +- `list_cursor_pagination`. +- `noop_emitter_does_not_block_mutations`. + +### `crates/manager-core/src/triggers_api.rs` (3 new docs tests) +- `docs_trigger_create_succeeds` — happy path + verifies the `TriggerDetails::Docs` round-trips with the right ops. +- `docs_trigger_empty_glob_rejected` — `" "` rejects with `Invalid`. +- `docs_trigger_member_without_role_is_forbidden` — denying authz repo + member principal denies. + +### `crates/executor-core/tests/sdk_docs.rs` (15 bridge integration tests) +- `docs_create_then_get_round_trip` / `docs_get_missing_returns_unit` / `docs_get_with_invalid_uuid_throws`. +- `docs_find_equality_returns_matches` / `docs_find_with_in_operator` / `docs_find_with_gt_comparison`. +- `docs_find_one_returns_envelope_or_unit`. +- `docs_update_then_get_reflects_change` / `docs_update_missing_throws`. +- `docs_delete_returns_was_present`. +- `docs_unsupported_operator_throws_with_v1_2_pointer`. +- `docs_empty_collection_name_throws`. +- `docs_list_returns_docs_array`. +- `docs_bridge_preserves_cross_app_isolation` — **load-bearing**: bridge + service together enforce isolation. +- `docs_envelope_has_id_data_created_at_updated_at` — pins Decision D's envelope shape. + +## 6. Open questions for the reviewer + +1. **CHECK constraint name verification** — 0014 drops constraints named `triggers_kind_check` and `outbox_source_kind_check` (Postgres's default for inline column-CHECKs). Please verify by running migrations from scratch + a fresh `\d+ triggers` / `\d+ outbox` against a stage DB before merge. The CHANGELOG includes a downgrade caveat but the upgrade path itself depends on this name match. +2. **`docs_repo` Postgres-integration tests** — I wrote SQL-shape tests against the QueryBuilder output (pure, no DB) but did **not** add `#[ignore]`-gated Postgres tests for the CRUD path. v1.1.1 also did not add them for KV's Postgres impl; following the precedent. If the reviewer wants live-DB tests for docs as a project standard, they can land in a follow-up — happy to do them in this branch if preferred. +3. **Parser promotion to `picloud-shared`** — Decision B says promote in v1.2 when `dead_letters::list` reuses it. If the reviewer wants the rename now (`picloud_shared::query::{Filter, FieldPath, ComparisonOp}`) to avoid the future rename, that's a quick mechanical move. +4. **Doc envelope future-proofing** — Decision D ships the explicit envelope. If a soft-delete `deleted_at` field gets added in v1.2, it should land inside the envelope (not inside `data`). The trait + repo would need a new optional column; the envelope shape stays flexible for it. +5. **Whether `find` should support `null`-LHS searches** — `$eq: null` correctly returns docs where the field is JSON-null OR missing (both produce SQL NULL via `jsonb_extract_path_text`). A user may expect `$eq: null` to mean *only* JSON-null (not missing). The current behavior matches the simplest mental model but I want this confirmed. + +## 7. Deferred items beyond what the brief calls out + +- **Postgres-integration tests for `docs_repo`** — see Open Question 2. +- **Dashboard surface for docs** — no UI in v1.1.2 (the brief notes this is fine; KV doesn't have completions in `rhai-mode.ts` either). Listed as a future UX-polish task. +- **Stable cursor encoding for `find`** — the v1.1.2 `find` doesn't paginate (returns all matches up to `$limit`). The v1.2 expansion (advanced query) should add cursor pagination to `find` to match `list`'s shape. +- **Dispatcher unit test for docs routing** — I considered extending the v1.1.1 dispatcher unit-test fixture (per the plan's test list) but the dispatcher's match-arm change is a single-line `Kv | DeadLetter | Docs` extension that's already covered by the existing `Kv` and `DeadLetter` arm tests. Adding a `Docs` clone wouldn't catch anything new; flagged here so the reviewer can decide. + +## 8. How to verify locally -### Static checks (all green on this branch) ```sh +# 1. Lint + format + build + tests cargo fmt --all -- --check cargo clippy --all-targets --all-features -- -D warnings cargo test --workspace -cd dashboard && npm run check && npm run build + +# 2. Fresh-DB migration test (assumes docker compose is set up) +docker compose down -v +docker compose up -d postgres +cargo run -p picloud # observe 0001..0014 apply cleanly + +# 3. Schema-on-top-of-v1.1.1 test +git checkout main +cargo run -p picloud # runs migrations through 0012 +git checkout feat/v1.1.2-documents +cargo run -p picloud # observe 0013 + 0014 apply incrementally + +# 4. End-to-end smoke (from the brief's "Done" checklist) +# a. Create an app + script via existing admin endpoints +# b. Bind the script to a route +# c. From a Rhai script via the route, exercise: +# let users = docs::collection("users"); +# let id = users.create(#{ name: "Alice", tier: "gold", age: 30 }); +# let doc = users.get(id); +# assert(doc.data.name == "Alice"); +# let gold = users.find(#{ tier: "gold" }); +# assert(gold.len() == 1); +# users.update(id, #{ name: "Alice", tier: "platinum", age: 30 }); +# d. POST /api/v1/admin/apps/{id}/triggers/docs pointing at a +# logging handler script +# e. Update or delete the doc; verify the handler fires with +# ctx.event.docs.prev_data showing the prior state + +# 5. Negative smoke +# users.find(#{ "$or": [...] }) → throws with v1.2 message +# users.find(#{ "a.b.c.d.e.f": "x" }) → depth-limit error +# docs::collection("") → empty-collection throw ``` -### Migration integrity -```sh -docker compose down -v && docker compose up -d postgres -cargo run -p picloud # applies 0001..0012 from empty -``` -Then start from `main` (v1.1.0 schema state) and switch to this -branch; restart `picloud` to apply 0007..0012 on top. +`cargo test --workspace` from this branch passes 320+ tests (77 new for v1.1.2). `cargo fmt --check` and `cargo clippy -- -D warnings` are clean. -### Manual end-to-end smoke (reviewer should run) -```sh -docker compose up -d -# 1. Bootstrap an owner user via the existing flow + create app A. -# 2. Create a script in A whose body is: throw "boom" -# 3. POST /api/v1/admin/apps/{A}/triggers/kv with -# {"script_id": "", "collection_glob": "*", "ops": ["insert"]} -# 4. From another script (or a public HTTP route): -# kv::collection("widgets").set("k1", #{n:1}) -# 5. Wait ~7 seconds (3 attempts × ~1/2/4s backoff with ±20% jitter). -# 6. Open the dashboard at /admin. -# 7. Apps list shows a red "1" badge next to app A. -# 8. Click into app A → "Dead letters" tab link → row visible. -# 9. Click row → full payload + error history. -# 10. Click "Replay" → row marks resolution='replayed', new outbox -# row written, dispatcher re-runs the handler (fails again, -# produces a NEW DL row). -# 11. Click "Mark resolved" on the original DL → resolution='ignored'. -``` +## 9. Known limitations / rough edges -### Async route smoke -```sh -# Create a route via POST /api/v1/admin/scripts/{id}/routes with -# {"host_kind":"any","path_kind":"exact","path":"/work","dispatch_mode":"async"} -curl -X POST -d '{"work":"thing"}' http://localhost:8080/work -# Expect: HTTP 202 + {"accepted_at":"...","execution_id":"..."} -# Then tail execution_logs — the script ran later (not synchronously). -``` +- **Text-lex comparison for `$gt`/`$gte`/`$lt`/`$lte`** — per the brief's contract (Decision E). Breaks across digit-count boundaries (`'10' < '9'` is TRUE under any text collation). Documented in CHANGELOG. Workaround: zero-pad numeric strings. v1.2 advanced query adds numeric-aware operators. +- **Concurrent `update()` `prev_data` race** — the CTE pattern (`WITH prev AS (SELECT) UPDATE`) mirrors KV's `set` and inherits the same last-writer-wins race under `READ COMMITTED`: two simultaneous updates can both emit the same `prev_data` if their reads race. KV accepts this; docs follows. If audit-grade `prev_data` semantics are needed later, the fix is `WITH old AS (SELECT … FOR UPDATE)`. +- **Rollback from v1.1.2 → v1.1.1** with queued `docs`-source outbox rows will cause the v1.1.1 dispatcher to fail `TriggerEvent::Docs` deserialization (`#[serde(tag = "source")]` rejects unknown variants). Drain or delete `outbox WHERE source_kind = 'docs'` before downgrading. Trunk-only deployments don't hit this. +- **`find` doesn't paginate** — v1.1.2 returns all matches in one array (subject to `$limit`). Pagination on filter queries is deferred to v1.2's advanced query expansion. +- **Filter `Map` ordering not guaranteed** — Rhai's `Map` doesn't preserve insertion order, so when a filter contains multiple top-level fields the resulting `WHERE` clause's condition order can vary between runs. Result set is identical (AND is commutative); only the SQL string differs. No correctness impact. +- **The `find` integration tests use a custom `InMemoryDocs` impl** that does its own minimal filter eval (because the executor-core crate can't depend on manager-core's parser). The fake replicates the unsupported-operator throw path so the v1.2-pointer test exercises the bridge's error-propagation pipeline end to end. -### Trigger-depth limit smoke -```sh -# Set a low depth limit + register a KV trigger whose script -# writes to KV again — creates a loop. -PICLOUD_MAX_TRIGGER_DEPTH=3 cargo run -p picloud -# kv.set(...) from a script → triggers same script → depth hits 4 -# Observe: depth-exceeded logged + outbox rows dropped (no DL spam). -``` +## Closing note -## 8. Known limitations / rough edges - -- **No DB-backed integration tests in this branch.** Unit tests - cover trait behaviour with in-memory backings; sqlx query - correctness is verified by the workspace compile + manual smoke. -- **Dispatcher concurrency is in-process serial-per-tick.** Up to - 8 rows claimed per tick, processed one at a time. Could be - parallelised with per-row `tokio::spawn` — kept serial for MVP - predictability (the gate already bounds total concurrent - executions globally). -- **Metric emission is TODO** at the three spots noted in - Open Questions §5. The behaviour they would observe is captured - via `tracing::warn`/`error` in the meantime. -- **`PostgresDeadLetterService::replay` doesn't restore the - original `trigger_depth`.** Replays start at depth 0. If a DL - row was originally produced at depth 7 with `max_trigger_depth=8` - and the replayed handler fans out again, it gets the full depth - budget. Acceptable for an admin-initiated replay (deliberate - retry), but worth noting if the reviewer disagrees. -- **HTTP outbox rows skip `is_dead_letter_handler` and the trigger- - principal path** since they don't originate from a trigger. The - `ResolvedTrigger` synthesized for them carries a sentinel zero - `AdminUserId` that's never used (HTTP rows never retry under - sync, and async-HTTP rows don't need a principal resolution). -- **DataPlaneState's executor field is still generic** (`Arc` - where `E: ExecutorClient`). The dispatcher uses `Arc` directly. The picloud binary clones the same - `Arc` into both — works because the - concrete type implements both the trait object and the generic - bound. -- **dispatcher always sets `principal: None` for HTTP rows.** As - noted in Open Question §3, HTTP outbox rows don't resolve a - principal. Sync HTTP doesn't need one (caller is anonymous); - async HTTP currently can't authenticate as the originating - caller. If that's not the intent, additive change. -- **Cluster-mode crash recovery for claimed rows.** A claimed row - stays claimed indefinitely if the dispatcher crashes mid- - execution. v1.1.1 has one dispatcher per process so this is - rare; cluster mode (v1.3+) needs a stale-claim sweeper. - ---- - -Branch ready for review. Reviewer reads this report + audits the -diff. Do not merge to main until the audit clears. +Reviewer audits the branch; on approval, the next step is to write `REVIEW.md` mirroring v1.1.1's audit-report format. The branch is ready.