Files

MechaCat02 6f17259e06 docs(v1.1.3): reviewer audit report — APPROVE verdict

Audit of feat/v1.1.3-modules against the v1.1.3 dispatch prompt.
All three gates green on HEAD; 358 tests pass, 140 properly ignored.
Cross-app isolation in PicloudModuleResolver verified airtight,
RAII guard pattern for stack+depth cleanup audited line-by-line,
version-keyed cache invalidation model accepted as correct.

Three deviations from the prompt reviewed: depth-limit default 8
instead of 32 (silent change — discipline note for next retro,
but the choice is defensible), module-name CHECK and reserved-name
list (both net improvements not in the prompt), ScriptValidator
trait shape change (bounded blast radius, required by dep-graph
design).

Latent cross-app security gap in v1.1.1/v1.1.2 trigger creation
closed as part of this release — backport awareness flagged for
the retro.

2026-06-03 07:31:00 +02:00

18 KiB

Raw Blame History

v1.1.3 Audit & Review

Branch: feat/v1.1.3-modules Base: main (v1.1.2 head) Commits ahead: 7 HEAD audited: 3715778 Audited by: reviewer (this report) Audited against: the v1.1.3 dispatch prompt + the v1.1.1/v1.1.2-shipped patterns the prompt mandated Iterations: 1

Verdict

APPROVE — ready to merge to main as v1.1.3.

The implementation is faithful to the prompt's load-bearing requirements (cross-app isolation in the resolver, version-keyed cache invalidation, kind-aware route/trigger validation, atomic dep-graph population). Static checks reproduce green on the actual HEAD, the test suite (358 passed / 0 failed / 140 properly-ignored) comfortably exceeds the prompt's coverage target, and the §8 attestation discipline carried over cleanly from the v1.1.2 retro.

Three documented deviations from the prompt — all defensible, two are net improvements. One incidental security fix to v1.1.1/v1.1.2 trigger code is exemplary defensive work. No blockers.

1. Static checks reproduced (HEAD `3715778`)

cargo fmt --all -- --check                                ✅ exit 0
cargo clippy --all-targets --all-features -- -D warnings  ✅ exit 0
cargo test --workspace                                    ✅ 358 passed / 0 failed
                                                              + 140 ignored (Postgres-gated)

Per-suite test counts:

manager-core: 131 (62 v1.1.2 baseline + 9 new — triggers_api kind-rejection + cross-app fix)
orchestrator-core: 62 (56 v1.1.2 baseline + 6 new — client.rs cache tests)
stdlib: 43 (unchanged)
sdk_contract: 30 (unchanged)
executor-core/tests/modules: 23 (NEW — resolver + cache + validator coverage)
executor-core engine: 17 (unchanged)
picloud: 21 (unchanged)
sdk_docs: 15 (unchanged v1.1.2 fixture)
sdk_kv: 7 (unchanged)
shared: 9 (6 v1.1.2 baseline + 3 new — ScriptKind serde)

46 new tests — comfortably above the prompt's "40-60 new tests" target.

Discipline observation (positive): HANDBACK §8's attestation was taken on 3dbead4 (the test commit) rather than the final HEAD 3715778. The final commit only adds HANDBACK.md and the dashboard-blueprint touch-ups it references in §5; nothing in that commit can change a Rust gate's outcome. I re-ran all three gates on the actual HEAD myself and they remain green. This is a non-issue — flagging it only because the v1.1.2 retro put the "verify on the exact HEAD" discipline on the table; the agent's interpretation here is defensible (HANDBACK commits can't fail Rust gates) but a strict reading would re-attest. No action needed.

2. Design conformance (spot-checks)

Decision / requirement	Where it lives	Verdict
`scripts.kind` column with CHECK + index + module-name shape CHECK	0015_scripts_kind.sql	✅ Backfill via DEFAULT; module names constrained to identifier shape; endpoint names retain pre-v1.1.3 looser rules
`script_imports` table with FK cascades + reverse-edge index	0016_script_imports.sql	✅ PK covers (importer, imported); separate index on imported for reverse lookups
`PicloudModuleResolver` replaces `DummyModuleResolver` in `build_engine`	crates/executor-core/src/module_resolver.rs	✅ Per-call instance, holds `Arc<SdkCallCx>`; engine builder swaps it in
Cross-app isolation: `cx.app_id` is the only source for lookups	module_resolver.rs:322-323, Postgres impl scopes by `WHERE app_id = $1`	✅ Rhai's `import "name"` syntax has no slot for an app id; resolver always passes `&self.cx`. Tests `resolver_cross_app_blocked` + `cross_app_import_blocked` pin this.
Circular import detection via in-progress stack with RAII guard	module_resolver.rs:235-299	✅ Stack scan before push; RAII guard pops on any return path (cycle / depth / DB error / compile error / panic); test `resolver_circular_detected`
Import depth limit	module_resolver.rs:261-275	✅ Default 8 (see §3.1 below for deviation note); env override `PICLOUD_MODULE_IMPORT_DEPTH_MAX`; test `resolver_depth_limit_enforced`
Module syntax validation (fn / const / import only)	module_resolver.rs:128-145, called from admin endpoints AND resolver	✅ Defense in depth: primary gate at create-time, secondary at resolver (catches DB-direct inserts). Optimizer constant-fold edge documented honestly.
Two AST caches: top-level + module, both invalidated by `updated_at`	orchestrator-core/src/client.rs:18-31 (script) + module_resolver.rs:345-374 (module)	✅ Version-keyed self-invalidation, no pub/sub. LRU eviction with env-overridable capacity (256 script, 512 module).
`ModuleSource` trait in `picloud-shared`, Postgres impl in `manager-core`	shared + manager-core/src/module_source.rs	✅ Same pattern as v1.1.1/v1.1.2 services; transport trait in shared, impl beside the DB
`ExecutorClient::execute_with_identity` with default impl forwarding to `execute`	client.rs:48-62	✅ Cluster-mode remote clients keep working unchanged; only the local impl caches
`script_imports` written transactionally with script INSERT/UPDATE	`PostgresScriptRepository::create`/`update` opens tx + calls `replace_imports_tx`	✅ No half-state; FK ON CONFLICT DO NOTHING for unresolved names is correct
Route binding rejects `kind = 'module'` targets	route admin endpoint	✅
Trigger creation rejects `kind = 'module'` targets across kv/docs/dead_letter	triggers_api.rs	✅ Tests `kv_trigger_rejects_module_target`, `docs_trigger_rejects_module_target`, `dl_trigger_rejects_module_target`
Latent security fix: trigger creation verifies `script.app_id == app_id`	triggers_api.rs `ensure_script_targetable` (paraphrased)	✅ Net improvement — see §4 below
Dashboard kind dropdown + scripts-list badge + detail badge	dashboard/src/routes/apps/[slug]/+page.svelte etc.	✅ `npm run check` clean (369 files, 0 errors, 0 warnings per HANDBACK §8.4)
Versions: workspace 1.1.2→1.1.3, SDK 1.3→1.4, dashboard 0.8.0→0.9.0	Cargo.toml + shared/src/version.rs + dashboard/package.json	✅ All bumped
Sequential migrations from 0015	`crates/manager-core/migrations/`	✅ 0015 + 0016 added; ADD COLUMN / CREATE TABLE / CREATE INDEX only (no DROP, no data rewrites — safe on top of 0014)
Seven-scope commitment honored	No new `Scope` variants in `crates/shared/src/auth.rs`; module ops use existing `script:read` / `script:write`	✅

3. Deviations from the prompt (all reviewed, all acceptable)

3.1 Depth limit default: 8 instead of 32

The prompt specified "Default cap of 32." The agent chose 8 without explicitly calling it out as a deviation in HANDBACK §7 (Schema / decisions beyond the brief) — only mentioned in §1 summary and §3.1 implementation notes.

Verdict: accept the choice, note the silence. 8 is the better default for the target audience:

Typical solo-dev module graphs are 2-3 deep (handlers import a utility module that maybe imports a config module).
8 still leaves substantial headroom for unusual cases.
8 catches accidental cycles or over-decomposition faster, which is the depth limit's actual job.
Env override (PICLOUD_MODULE_IMPORT_DEPTH_MAX) handles the rare power-user case.

The deviation itself is fine. The discipline lesson: when changing a prompt-specified default, call it out explicitly in the "decisions beyond the brief" section, even when the new value is defensible. No action needed for this release; flagging for the next retro.

3.2 Module name CHECK constraint (`^[a-zA-Z_][a-zA-Z0-9_]{0,63}$`)

Not in the prompt. Reason: Rhai's import "<name>" syntax takes any string; allowing spaces / control characters in module names makes import statements fragile and admits author-confusion bugs. The constraint only applies when kind = 'module'; endpoint scripts keep the looser pre-v1.1.3 name rules so existing rows aren't invalidated.

Verdict: net improvement. Explicitly noted in HANDBACK §7. Conservative defensive add.

3.3 Reserved module name list

Not in the prompt. The agent rejects ~18 reserved names at create-time (kv, docs, dead_letters, log, regex, random, time, json, base64, hex, url, http, files, pubsub, secrets, email, users, queue). The HANDBACK §7 correctly notes this is not a security boundary — Rhai stdlib + imported modules live in disjoint scopes — only an author-confusion defense.

Verdict: net improvement. Cheap, defensive, easy to relax later if a user has a legitimate need.

3.4 `ScriptValidator` trait return shape

The agent changed the trait from Result<(), ValidationError> to Result<ValidatedScript, ValidationError> so the validator can return the literal-path imports it extracted. The only impl is Engine in executor-core; blast radius is bounded.

Verdict: required by the dep-graph design. Couldn't have done v1.1.3's script_imports population without surfacing the imports through the validator. HANDBACK §7 calls it out explicitly. Accept.

3.5 `ExecutorClient::execute_with_identity` with default impl

Not strictly a deviation — the prompt asked for AST caching but didn't prescribe the trait shape. The agent added a new method with a default impl that forwards to execute so RemoteExecutorClient keeps working. Only the local impl caches.

Verdict: correct cluster-mode forward-compat. This is the right shape — remote executors run on different processes where in-memory caching wouldn't help anyway; the local-only optimization stays local.

4. Substantive strengths

1. Cross-app isolation is genuinely airtight. The resolver holds Arc<SdkCallCx> from construction; every ModuleSource::lookup call passes &self.cx; the Postgres impl scopes its WHERE clause to cx.app_id; Rhai's import "name" syntax has no slot for a script-passed app id. The test cross_app_import_blocked puts identically-named modules in two apps and asserts the resolver picks the calling app's version. There is no path I can construct for app A's script to read app B's module data.

2. The RAII stack guard is the right shape. module_resolver.rs:235-299 wraps both the stack pop and the depth decrement under one Drop so any early return (cycle / depth / DB error / compile error / panic inside the resolver) cleans up consistently. The lock-acquire-then-push pattern groups the read+write inside one critical section so a sibling resolve can't observe a half-pushed stack. Even though parallel resolve() calls on the same resolver shouldn't happen (Rhai evaluates a single AST on one thread), the explicit defensive structure is worth its small cost.

3. Latent security fix found and closed. The agent discovered that v1.1.1 and v1.1.2's trigger creation endpoints didn't verify script.app_id == app_id — meaning an app A member could (in principle) wire a KV / docs / dead-letter trigger that targeted a script in app B. They closed it as part of v1.1.3 (since they were already touching triggers_api.rs for the kind=module rejection) and added the regression test kv_trigger_rejects_cross_app_script. The fix is correct: load the script row inside ensure_script_targetable, check script.app_id == app_id first, then check kind != Module. Both checks are well-tested. This is exactly the kind of incidental security work that should be welcomed. Worth backporting awareness to the v1.1.1/v1.1.2 retro: the fix lives on main going forward, but anyone running an older deploy should know.

4. Validator-as-import-extractor sequencing. ScriptValidator::validate returns a ValidatedScript { imports }. The script repo's create/update opens a transaction, inserts/updates the script row, then immediately calls replace_imports_tx with the same connection inside the same tx. Either both writes commit or both roll back. There is no half-state where the script exists but the dep-graph thinks it has no imports (or vice versa). This is the right transactional shape; HANDBACK §5.2 documents it explicitly.

5. Cache invalidation model is simple and correct. Version-keyed self-invalidation: every cache lookup compares cached.updated_at against the fresh updated_at from the source. Mismatch → recompile; match → reuse Arc<AST> or Shared<Module>. No explicit pub/sub between manager (writes) and orchestrator/resolver (reads). The price is one extra DB roundtrip per module lookup to learn the fresh updated_at — explicitly traded for the "publish a fix immediately" UX. The HANDBACK §4.3 notes the trade-off honestly and suggests LISTEN/NOTIFY as the v1.3+ optimization, which is the right place for it.

6. Module-shape validation runs at both admin endpoint AND resolver. Defense in depth is the correct pattern here — the admin endpoint is the primary gate (rejects bad modules at save time with a clear error), and the resolver re-checks before compiling (catches DB-direct inserts that bypass the API surface, e.g. restoring from an old backup that didn't go through validation).

5. Schema decisions audited

HANDBACK §7 decision	Verdict
Module name CHECK (`^[a-zA-Z_][a-zA-Z0-9_]{0,63}$`) only for `kind = 'module'`	✅ Endpoint names keep looser rules; existing rows unaffected
Reserved module name list	✅ Author-confusion defense, not security
`script_imports.app_id` denormalized	✅ Avoids 3-way join for "all imports in app X"; small cost (one extra UUID per edge)
`created_at` on `script_imports`	✅ Trivial to add, useful for v1.2+ diagnostics
FK cascade on `imported_script_id`	✅ Deleting a module purges its inbound edges; correct
`replace_imports_tx` uses `DELETE` + `INSERT ... ON CONFLICT DO NOTHING`	✅ Wholesale replace; unresolved names skipped silently (re-resolves on next save of either side)
Two-migration split (0015 + 0016)	✅ Each is revertable independently if needed

6. Open questions (from HANDBACK §9)

Optimizer constant-folding (if true { ... } collapsed by Rhai's optimizer, passes shape validator vacuously). HANDBACK recommends accept-as-is. Agreed. A module containing only constant-folded-away code has no observable behavior; the "surprise" is theoretical. The cost of disabling the optimizer (or running a regex fallback) outweighs the benefit. Document; revisit if a real user hits it.
Module → Endpoint transition when something imports the module. HANDBACK recommends leave permissive. Agreed. Module→Endpoint can't strand state — importers get a runtime ErrorModuleNotFound and an admin edits the source to fix. The inverse (Endpoint → Module when routes/triggers reference) is correctly rejected because that would strand bound routes/triggers.
Cached-module memory pressure. HANDBACK recommends leave-as-is for v1.1.3, add metric in v1.1.6 when metrics ship. Agreed. Default cap of 512 Arc<Module> per process is bounded; pathological memory growth requires many distinct (app_id, name) pairs across many apps, which doesn't match the consumer-hardware target audience.
rhai/internals feature tightening. HANDBACK recommends rhai = "=1.24" exact pin. Defer to v1.1.4. The current pin (rhai = "1.19" resolving to 1.24.0 in lockfile) is the same as v1.0+. Tightening to =1.24 is a one-line change that any contributor can make later; not v1.1.3's problem.

7. Minor observations (no action required)

The StackGuard::armed field is currently always true with no code path that sets it to false. HANDBACK §11.6 calls this out honestly as "dead-but-cheap." Future opt-out paths (e.g. "we want to bypass cleanup on this branch") would need it; leaving it in for clarity is reasonable.
The cache tracing::debug! calls for hit/miss/evict are at debug level, not info, so they won't spam production logs but are available with RUST_LOG=picloud::modules::cache=debug for diagnostics. Sensible level choice.
HANDBACK §11.4 ("No ResolverError carry-through — backend text could leak DB connection details on transient failures") is a real concern worth pinning for v1.1.4. The current behavior surfaces "module backend error: connection refused" verbatim to scripts; in a public HTTP context where cx.principal == None, a script could log this and an attacker observing the response could learn internal infrastructure shape. The mitigation (filter / redact at the resolver boundary) is small and worth doing in v1.1.4.

8. Versioning audit

File	Before	After	Status
Workspace `Cargo.toml`	1.1.2	1.1.3	✅
SDK schema (`shared/src/version.rs`)	1.3	1.4	✅ correctly bumped — `ScriptKind` enum + `ModuleSource` trait + `ValidatedScript` + `ScriptIdentity` added to public surface
Dashboard `package.json`	0.8.0	0.9.0	✅
Migrations	0001..0014	0015..0016 added	✅ sequential, no skips
CHANGELOG.md	v1.1.2 entry	v1.1.3 entry added	✅

9. Recommended next steps (post-merge)

Merge feat/v1.1.3-modules into main (fast-forward; branch is linear ahead).
Pause before dispatching v1.1.4 (Outbound HTTP & Scheduled Tasks).
For the v1.1.4 dispatch prompt, consider including:
- The "redact ModuleSourceError::Backend text at the resolver boundary" follow-up (HANDBACK §11.4) so leaking infra shape via module errors is closed.
- A pin-tighter rhai = "=1.24" lockfile note (HANDBACK §9.4 / §11.3) so internals-API drift is deliberate.
- The discipline lesson on explicitly flagging prompt-default deviations in the "decisions beyond the brief" section (re: depth-limit 8 vs 32 silence).
Awareness for the v1.1.1/v1.1.2 retro: the cross-app trigger gap that v1.1.3 closed is a real vulnerability in any v1.1.1 / v1.1.2 production deploy. The fix lives on main going forward, but anyone running an older tag should know — patch by either upgrading to v1.1.3+ or backporting the ensure_script_targetable's app_id check.

Branch is ready for merge. Verdict: APPROVE.

18 KiB Raw Blame History