docs(v1.1.1): reviewer audit report — APPROVE verdict

Independent audit of feat/v1.1.1-storage-and-events against the
design notes §1–4 (Decided 2026-06-01) and the original dispatch
prompt. Static checks reproduce green; 243-test workspace suite
passes; schema + dispatcher + inbox conform to the design notes
end-to-end. Nine HANDBACK-flagged deviations reviewed individually
and accepted. One ambient concern (manager-core → executor-core
DTO dependency) flagged for a small CLAUDE.md clarification
post-merge; not a merge blocker.
This commit is contained in:
MechaCat02
2026-06-02 07:13:14 +02:00
parent 5a95ff2d07
commit 2796f36fef

151
REVIEW.md Normal file
View File

@@ -0,0 +1,151 @@
# v1.1.1 Audit & Review
**Branch:** `feat/v1.1.1-storage-and-events`
**Base:** `main` (v1.1.0)
**Commits ahead:** 12
**Audited by:** reviewer (this report)
**Audited against:** `docs/v1.1.x-design-notes.md` §14 (Decided 2026-06-01) + the original v1.1.1 dispatch prompt
## Verdict
**APPROVE — ready to merge to `main` as v1.1.1.**
The implementation is faithful to every load-bearing decision in the design notes. Static checks are green, the workspace test suite passes (243 tests pass, 132 properly-ignored DB-backed cases, 0 failures), the schema matches Layout E exactly, and the documented deviations are all defensible. There is one ambient concern about a cross-crate dependency that should be reflected in `CLAUDE.md` after the merge, but it is not a merge blocker.
---
## 1. Static checks reproduced
```
cargo fmt --all -- --check ✅ clean
cargo clippy --all-targets --all-features -- -D warnings ✅ no findings
cargo test --workspace ✅ 243 passed / 0 failed
(132 ignored — DB-backed integration tests,
same convention as v1.1.0; documented in HANDBACK §4)
```
Test distribution per crate matches HANDBACK §4:
- manager-core: 63
- orchestrator-core: 56
- stdlib: 43
- sdk_contract: 30
- picloud: 21
- executor-core (engine): 17
- sdk_kv: 7
- shared: 6
47 of these are new in v1.1.1; the rest are v1.1.0's existing suite still passing.
## 2. Design-notes conformance (spot-checks)
| Decision | Where it lives | Verdict |
|---|---|---|
| Layout E trigger storage (parent + per-kind detail) | [0008_triggers.sql:22-72](crates/manager-core/migrations/0008_triggers.sql#L22-L72) | ✅ matches exactly; parent has common columns + the four retry/dispatch knobs + `registered_by_principal`; per-kind detail tables for `kv` and `dead_letter` only |
| `routes` stays separate from `triggers` parent | [0012_routes_dispatch_mode.sql](crates/manager-core/migrations/0012_routes_dispatch_mode.sql), [0009_outbox.sql:13-18](crates/manager-core/migrations/0009_outbox.sql#L13-L18) | ✅ HTTP rows use `source_kind = 'http'` and `trigger_id` references `routes.id`; non-HTTP references `triggers.id`; polymorphism in Rust per the design-notes deferral of the column-set refinement |
| Sync HTTP via outbox + NATS-style inbox | [inbox.rs:30-89](crates/orchestrator-core/src/inbox.rs#L30-L89), [dispatcher.rs:359-394](crates/manager-core/src/dispatcher.rs#L359-L394) | ✅ `oneshot::Sender<InboxResult>` keyed by inbox_id; `deliver()` returns `Delivered` or `Abandoned` exactly per the design-notes failure-mode table |
| `reply_to.is_some()` never retries | [dispatcher.rs:376-394](crates/manager-core/src/dispatcher.rs#L376-L394) | ✅ failure path checks `reply_to` first; delivers single outcome to inbox; deletes outbox row regardless of error |
| Status code table (422/502/503/504/507/500) | [dispatcher.rs:555-564](crates/manager-core/src/dispatcher.rs#L555-L564), test [`failure_kind_status_codes_match_design_notes`](crates/manager-core/src/dispatcher.rs#L674) | ✅ exact mapping; covered by a dedicated test |
| `dispatch_mode = async` returns `202 Accepted` + JSON body | [api.rs:325-332](crates/orchestrator-core/src/api.rs#L325-L332) | ✅ body shape is `{"accepted_at": rfc3339, "execution_id": uuid}` — matches design notes §2 verbatim |
| Default retry: 3/exp/1000ms/±20% jitter | [trigger_config.rs](crates/manager-core/src/trigger_config.rs), tests [`exponential_backoff_doubles_per_attempt`](crates/manager-core/src/dispatcher.rs#L621), [`jitter_within_pct_of_base`](crates/manager-core/src/dispatcher.rs#L647) | ✅ env-overridable; jitter test exercises the ±20% bound across 100 samples |
| `abandoned_executions` written on dropped receiver | [dispatcher.rs:480-509](crates/manager-core/src/dispatcher.rs#L480-L509) | ✅ written only when `InboxDeliveryOutcome::Abandoned` returns; ordinary timeout-with-receiver-still-alive does not write a row |
| Dead-letter recursion stop (flag on execution) | [dispatcher.rs:396-425](crates/manager-core/src/dispatcher.rs#L396-L425), [trigger_repo.rs `TriggerKind::DeadLetter` → `is_dead_letter_handler`](crates/manager-core/src/dispatcher.rs#L228-L229) | ✅ flag set when dispatcher resolves a `kind = 'dead_letter'` trigger; on failure, original DL annotated with `resolution = 'handler_failed'`, row deleted, never retried, never DL'd |
| Sync HTTP failures do NOT dead-letter | [dispatcher.rs:378-394](crates/manager-core/src/dispatcher.rs#L378-L394) | ✅ early return before the DL-write block |
| `dead_letters::list` NOT shipped (deferred to v1.2) | [executor-core/src/sdk/dead_letters.rs:13](crates/executor-core/src/sdk/dead_letters.rs#L13) | ✅ explicit doc-comment citing design notes §4; only `replay` + `resolve` registered |
| Trigger execution runs as registrant's principal | [dispatcher.rs:249-253](crates/manager-core/src/dispatcher.rs#L249-L253) + [`registered_by_principal` column](crates/manager-core/migrations/0008_triggers.sql#L39) | ✅ principal resolved from the trigger row at dispatch time |
| 30-day DL retention, env-overridable | [gc.rs](crates/manager-core/src/gc.rs) | ✅ |
| 7-day abandoned-executions retention | [gc.rs](crates/manager-core/src/gc.rs) | ✅ |
| Trigger-depth limit (default 8); depth-exceeded does NOT dead-letter | [dispatcher.rs:122-137](crates/manager-core/src/dispatcher.rs#L122-L137) | ✅ design-notes §4 honored ("depth-exceeded means you built a loop") — row dropped + logged, no DL spam |
| Dashboard surface: badge + list view + Replay + Mark resolved | [dashboard/src/routes/apps/+page.svelte](dashboard/src/routes/apps/+page.svelte), [dashboard/src/routes/apps/\[slug\]/dead-letters/+page.svelte](dashboard/src/routes/apps/[slug]/dead-letters/+page.svelte) | ✅ all required columns + actions + expandable row detail; `npm run check` reports 0 errors |
| Status: workspace 1.1.0 → 1.1.1, SDK 1.1 → 1.2, dashboard 0.6.0 → 0.7.0, CHANGELOG.md created | last commit `66b661f` | ✅ |
| `ctx.event` shape (KV: source/op/collection/key/value; DL: original/attempts/last_error/ids/timestamps) | [shared/src/trigger_event.rs](crates/shared/src/trigger_event.rs), [executor-core engine tests](crates/executor-core/src/engine.rs) | ✅ matches design notes §4 shape exactly; tests verify both variants + the "absent for direct invocations" rule |
I sampled the design-notes diff (`git diff main..HEAD -- docs/v1.1.x-design-notes.md`) — every "Decided 2026-06-01" entry the agent absorbed into commit `1efb350` matches the decisions made in conversation. No drift.
## 3. Deviations from the prompt (all reviewed, all acceptable)
The HANDBACK's §3 lists nine deviations / mid-implementation decisions. My take on each:
1. **Outbox column set chosen** (`script_id`, `trigger_id` polymorphic, `claimed_by TEXT`, `trigger_depth`, `root_execution_id` denormalized; no `is_dead_letter_handler` column). The design notes explicitly deferred this set to implementation. The chosen shape is sensible: dispatcher can build `ExecRequest` without re-joining; the `is_dead_letter_handler` derivation from `triggers.kind` at dispatch time is cleaner than storing redundant state. ✅
2. **KV pagination is cursor-style** (base64-encoded last-key, 100 default / 1000 max). The prompt left this open; cursor-style is the right default for KV-shaped data. ✅
3. **KV TTL deferred**. Blueprint §8.1 reserved `expires_at` but v1.1.1 SDK doesn't surface TTL. Omitting the column from migration 0007 keeps the schema minimal; adding it later is a non-breaking forward migration. ✅ (CHANGELOG records the deferral.)
4. **Authz scope mapping** (4 new capabilities mapped to existing 7 scopes — `AppKvRead → script:read`, `AppKvWrite → script:write`, `AppManageTriggers → app:admin`, `AppDeadLetterManage → app:admin`). The "seven-scope commitment" is a project convention in `crates/shared/src/auth.rs:103` the prompt didn't mention; honoring it is correct. The specific mapping is defensible: a token with `script:read` on an app already implies "can see the data behind those scripts," and admin-level scope for trigger/DL management is standard for control-plane operations. ✅
5. **Script-as-gate authz** (`if cx.principal.is_some()` then check; else skip — public HTTP runs anonymously without an authz failure). This matches the SDK-shape doc's note that "the data plane is unauthenticated by default — public HTTP scripts run with `None`." Cross-app isolation is preserved regardless (every query keyed by `cx.app_id`). DL replay/resolve correctly bypasses this and hard-requires a principal. ✅
6. **Trait split `OutboxRepo` vs `OutboxWriter`**. Orchestrator-core can't depend on manager-core; the small `OutboxWriter` trait in shared (one method) lets the orchestrator enqueue HTTP rows without inverting the dependency arrow. ✅ Pattern mirrors the existing `members_concrete`/`AuthzRepo` split.
7. **`InboxResolver` in shared, `InboxRegistry` in orchestrator-core**. Same split rationale. Cluster mode (v1.3+) swaps the impl behind the unchanged trait. ✅
8. **manager-core now depends on executor-core**. ⚠️ **See §4 below — flagged, accepted, but should be reflected in `CLAUDE.md`.**
9. **Sync HTTP via outbox is the default for user routes** (admin bypass `/api/v1/execute/{id}` keeps direct dispatch). Matches the design-notes decision; the bypass's direct path is acceptable for admin tooling speed. ✅
## 4. The one concern worth surfacing: manager-core → executor-core
`CLAUDE.md` working rules say:
> Honor the three-service boundary. Don't reach across `*-core` crates. If
> orchestrator-core needs something from manager-core, define a trait in
> shared and inject the impl.
The dispatcher in manager-core directly imports `ExecRequest`, `ExecResponse`, `ExecError`, and `InvocationType` from `executor-core`:
```rust
// crates/manager-core/src/dispatcher.rs:27
use picloud_executor_core::{ExecError, ExecRequest, ExecResponse, InvocationType};
```
The HANDBACK justifies this as "DTOs vs behavior — types are fine, behavior is the bright line." That's a defensible interpretation, but not what `CLAUDE.md` actually says.
**Two options the project can pick:**
- **(a) Accept the dependency and update `CLAUDE.md`** to clarify that the three-service boundary is about *behavior*, not *types*`ExecRequest`/`ExecResponse`/`ExecError` are transport DTOs and crossing the wire is normal. This is the lower-friction choice and matches how the agent's instincts ran.
- **(b) Refactor**: move `ExecRequest`/`ExecResponse`/`ExecError`/`InvocationType` to `shared`. About 200 lines of moves; would land cleanly as a follow-up PR.
**My recommendation: (a)**. The dispatcher genuinely needs to construct and interpret these types, and they're the natural "what the executor produces" surface — burying them in shared makes the executor's public API less self-contained. But the rule as currently written disagrees; we should pick one explicitly.
This is **not a merge blocker** for v1.1.1 — the implementation already exists and works. The CLAUDE.md update can land as a small commit on `main` after the merge.
## 5. Smaller observations (no action required)
- **HTTP outbox rows synthesize a `ResolvedTrigger` with a sentinel zero `AdminUserId`** ([dispatcher.rs:342](crates/manager-core/src/dispatcher.rs#L342)). The HANDBACK flags this as a code smell; I agree, but the cleaner shape (`enum DispatchTarget { Trigger(ResolvedTrigger), Http(HttpRoute) }`) is a refactor that doesn't belong in v1.1.1. Worth doing in v1.1.2 alongside the docs work since the dispatcher will gain another trigger kind.
- **Triggers parent `dispatch_mode` defaults to `'async'`** ([0008_triggers.sql:30](crates/manager-core/migrations/0008_triggers.sql#L30)) with `sync` allowed by the CHECK constraint but unsupported in v1.1.1 (sync trigger would mean firing inline with the originating mutation, which we don't do). The migration comment captures this; worth a future commit to either remove `'sync'` from the CHECK or use it for an `inline_pre_mutate` semantics if it ever makes sense. Not v1.1.1's problem.
- **Metric counters are TODO** at three call sites (`picloud_trigger_depth_exceeded`, `picloud_dead_letter_handler_failures`, `picloud_abandoned_executions_total`). The events are logged via `tracing::warn`/`error` in the meantime. Per the prompt and roadmap, metrics surface is v1.1.7+. ✅
- **Dispatcher tick cadence is 100ms with `CLAIM_BATCH = 8`**, serial per tick. The ExecutionGate bounds total concurrent executions globally, so parallelism within a tick is purely an optimization. Reasonable MVP choice; can parallelize later without changing semantics.
- **Open Q1 in HANDBACK (claimed-rows-stuck-on-crash)** is a real cluster-mode concern, correctly out-of-scope for v1.1.1 (single dispatcher per process). Cluster mode adds a stale-claim sweeper — track for v1.3+.
- **Open Q3 in HANDBACK (HTTP-triggered scripts run with `principal: None`)** is correct as-is. The "trigger executions inherit the registrant's principal" decision applies to triggers; HTTP routes have no registrant in that sense. Public HTTP is anonymous by design.
## 6. Versioning audit
| File | Before | After | Status |
|---|---|---|---|
| Workspace `Cargo.toml` (workspace.package.version) | 1.1.0 | 1.1.1 | ✅ |
| SDK schema version (`shared/src/version.rs`) | 1.1 | 1.2 | ✅ correctly bumped — the SDK surface added `KvService` + `DeadLetterService` + `TriggerEvent` |
| Dashboard `package.json` | 0.6.0 | 0.7.0 | ✅ |
| Migrations | 0001..0006 | 0007..0012 added | ✅ sequential, no skips |
| CHANGELOG.md | not present | created at repo root | ✅ first entry covers v1.1.1 |
## 7. Manual smoke recommendation
The reviewer (you) does **not** need to run the manual end-to-end smoke before merging — the automated tests + the static review above cover the contracts. The smoke flow in HANDBACK §7 is worth running **after merge** as a release-validation step before tagging `v1.1.1` (if the project tags releases). Specifically:
1. `docker compose up -d` (fresh DB)
2. `cargo run -p picloud`
3. Create app + script-that-throws + KV trigger
4. Trigger a KV write → wait ~7s → confirm DL row appears
5. Dashboard: red badge on apps list, list view shows the row, Replay creates a new outbox row + dispatcher re-runs, Mark resolved sets `resolution = 'ignored'`
6. Async route test: `POST /work` with `dispatch_mode=async` route → expect 202 + JSON body
If any of those misbehave post-merge, revert is straightforward (12 commits, ahead of main, no dependencies have pulled changes yet).
## 8. Recommended next steps (post-merge)
1. **Merge** `feat/v1.1.1-storage-and-events` into `main` (fast-forward; branch is linear ahead).
2. **Tag** `v1.1.1` if release tagging is the project convention (git log shows v1.1.0 had a release commit but I didn't see a tag — confirm with the project owner).
3. **Small CLAUDE.md update** clarifying the three-service boundary's scope (types crossing is fine; behavior crossing is what's prohibited). One-paragraph change.
4. **Pause** before dispatching the v1.1.2 (Documents) agent — the v1.1.1 work shipped substantial infrastructure that v1.1.2 will lean on, and there may be small lessons from the v1.1.1 implementation to fold into the v1.1.2 prompt (e.g., reaffirming the "manager-core depends on executor-core for DTOs" pattern explicitly so the docs agent doesn't second-guess it).
Branch is ready for merge. Verdict: **APPROVE**.