Audit of feat/v1.1.5-files-pubsub against the v1.1.5 dispatch prompt. All gates green on HEAD; 491 tests pass (+64 new), 139 ignored. Atomic write protocol audited line-by-line: single-pass SHA-256, temp→fsync→rename→fsync-dir→DB sequence as specified, unique pid+ counter temp suffix, path-traversal defense at SDK boundary and repo. Pub/sub fan-out is correctly transactional (single tx begin+commit; one outbox row per matching subscriber; trigger_depth saturating- bumped). Topic pattern matcher rejects every shape the brief called out (*.created, **, a.*.b, user.*x, *user, empty). Three flagged open questions resolved: orphan-sweep deferred (matches planning decision), test count 63 vs 70 (defensible — gap is the dispatcher e2e test, which is already covered for kv/docs/cron via the shared dispatcher path), empty-blob = missing-data (defensible interpretation, relaxable later). First CI workflow added; schema_snapshot un-ignored with DATABASE_URL- absent skip path.
157 lines
18 KiB
Markdown
157 lines
18 KiB
Markdown
# v1.1.5 Audit & Review
|
||
|
||
**Branch:** `feat/v1.1.5-files-pubsub`
|
||
**Base:** `main` (v1.1.4 head)
|
||
**Commits ahead:** 4 (3 substantive + handback)
|
||
**HEAD audited:** `9492c18`
|
||
**Audited by:** reviewer (this report)
|
||
**Audited against:** the v1.1.5 dispatch prompt + the v1.1.1–v1.1.4 patterns it mandated
|
||
**Iterations:** 1
|
||
|
||
## Verdict
|
||
|
||
**APPROVE — ready to merge to `main` as v1.1.5.**
|
||
|
||
Both new services are faithful to the prompt's load-bearing requirements: the atomic write protocol matches the spec step-for-step, the pub/sub fan-out is correctly transactional with one outbox row per matching subscriber, topic pattern matching rejects every shape the brief said to reject. The commit split is cleanly per-feature (3 commits vs v1.1.4's single mega-commit — the agent acted on the v1.1.4 retro lesson without being asked). The CI follow-up landed: schema-snapshot un-ignored with a `DATABASE_URL`-absent skip path, plus the first CI workflow added.
|
||
|
||
Three open questions raised in HANDBACK §9 — orphan sweep deferred (confirmed during planning), 63-vs-target-70 test count (defensible — see §4 below), empty-blob-as-missing-data interpretation (defensible — see §4 below). None are blockers.
|
||
|
||
---
|
||
|
||
## 1. Static checks reproduced (HEAD `9492c18`)
|
||
|
||
```
|
||
cargo fmt --all -- --check ✅ exit 0
|
||
cargo clippy --all-targets --all-features -- -D warnings ✅ exit 0
|
||
cargo test --workspace ✅ 491 passed / 0 failed
|
||
+ 139 ignored (Postgres-gated; one
|
||
less than v1.1.4 because
|
||
schema_snapshot moved out of
|
||
#[ignore])
|
||
```
|
||
|
||
Per-suite test counts (delta from v1.1.4 baseline):
|
||
- manager-core: 229 (was 184 → +45; files repo + service + admin API + pubsub repo + service + admin endpoint + their tests)
|
||
- executor-core/tests/sdk_files: 14 (NEW — bridge integration)
|
||
- executor-core/tests/sdk_pubsub: 5 (NEW — bridge integration)
|
||
- executor-core/tests/sdk_http: 15 (unchanged)
|
||
- executor-core/tests/sdk_docs: 15 (unchanged)
|
||
- executor-core/tests/modules: 23 (unchanged)
|
||
- orchestrator-core: 62 (unchanged)
|
||
- stdlib: 43 (unchanged)
|
||
- sdk_contract: 30 (unchanged)
|
||
- executor-core engine: 17 (unchanged)
|
||
- picloud: 21 (unchanged)
|
||
- module_redaction_logging: 1 (unchanged)
|
||
- shared: 8 (was 9 → −1; one moved into pubsub module's own tests + tracker drift)
|
||
- sdk_kv: 7 (unchanged)
|
||
- schema_snapshot: 1 (NEW — un-ignored; skips when DATABASE_URL unset)
|
||
|
||
Net: 64 new tests on my counting (HANDBACK says 63; immaterial off-by-one). Comfortably below the 70–90 prompt target — see §4 for whether that gap matters.
|
||
|
||
## 2. Design conformance (spot-checks)
|
||
|
||
| Decision / requirement | Where it lives | Verdict |
|
||
|---|---|---|
|
||
| Collection-scoped files (`(app_id, collection, id)`) | [0018_files.sql](crates/manager-core/migrations/0018_files.sql) | ✅ Primary key + server-generated UUID; matches the agreed expansion of the blueprint's app-flat sketch |
|
||
| Filesystem path `<root>/files/<app_id>/<collection>/<id[0:2]>/<id>` | [files_repo.rs:228-238 shard_dir_at + final_path_at](crates/manager-core/src/files_repo.rs#L228-L238) | ✅ Sharded by first two chars of UUID; `0o700` permissions via `create_dir_all_secure` |
|
||
| **Atomic write protocol (temp→fsync→rename→fsync_dir→DB)** | [files_repo.rs:244-277 write_atomic_at](crates/manager-core/src/files_repo.rs#L244-L277) | ✅ Steps 2–6 exactly as the prompt spec; DB INSERT is step 7 in the impl above; unique temp suffix `<id>.tmp.<pid>-<atomic_counter>` avoids collisions; parent-dir fsync after rename |
|
||
| Single-pass SHA-256 (file never re-read on write) | [files_repo.rs:258-260](crates/manager-core/src/files_repo.rs#L258-L260) | ✅ Hash the in-memory `&[u8]` once during the same call that writes it |
|
||
| Checksum-on-get throws Corrupted, no auto-delete | [files_repo.rs:282-299 read_verify_at](crates/manager-core/src/files_repo.rs#L282-L299) | ✅ Logs at error level with path, returns `FilesError::Corrupted`, never auto-deletes |
|
||
| Atomic delete (row inside tx; unlink outside) | files_repo.rs delete impl | ✅ Per HANDBACK §3; orphan unlink logged at warn |
|
||
| **Path-traversal validation at SDK boundary + repo** | [files_repo.rs:201-211 guard_collection](crates/manager-core/src/files_repo.rs#L201-L211) + `picloud_shared::validate_files_collection` | ✅ Rejects empty, `/`, `\`, `..`, NUL. Defense in depth (SDK + repo). |
|
||
| Trigger payloads exclude blob bytes | `TriggerEvent::Files` shape carries metadata only | ✅ Per HANDBACK §3; design notes mandate |
|
||
| Per-file size cap 100 MB; `PICLOUD_FILES_MAX_FILE_SIZE_BYTES` override | [files_repo.rs:50, 106-115 FilesConfig::from_env](crates/manager-core/src/files_repo.rs#L50) | ✅ |
|
||
| `files:*` trigger kind (Layout E extension) | [0019_files_triggers.sql](crates/manager-core/migrations/0019_files_triggers.sql) | ✅ Mirrors 0014/0017 pattern; `ops TEXT[]` + `collection_glob` mirrors KV |
|
||
| `Capability::AppFilesRead/Write` → `script:read/write` | manager-core::authz extensions | ✅ Seven-scope commitment held |
|
||
| `pubsub::publish_durable(topic, message)` | shared/pubsub.rs + executor-core/src/sdk/pubsub.rs | ✅ Single function; explicit `_durable` suffix matches §1 design-notes decision |
|
||
| **Publish-time transactional fan-out (one outbox row per matching subscriber)** | [pubsub_repo.rs:70-117 fan_out_publish](crates/manager-core/src/pubsub_repo.rs#L70-L117) | ✅ Single `tx` begins, SELECTs enabled pubsub triggers for app, filters topic in Rust, INSERTs one outbox row per match, commits once. Cross-app gate via `WHERE t.app_id = $1`. `trigger_depth` saturating-bumped, `root_execution_id` propagated. |
|
||
| No-match publish succeeds silently | pubsub_repo.rs returns `Ok(0)` when no triggers match | ✅ |
|
||
| Topic pattern matching: exact / prefix.* / universal `*` | [shared/pubsub.rs:65-74 topic_matches](crates/shared/src/pubsub.rs#L65-L74) | ✅ Uses `strip_suffix('*')` — clean implementation; `prefix` retains the trailing `.` so `"user.*"` doesn't match `"users.created"` |
|
||
| **Mid-pattern wildcards rejected at validation** | [shared/pubsub.rs:85-100 validate_topic_pattern](crates/shared/src/pubsub.rs#L85-L100) | ✅ Tests pin rejection of `*.created`, `**`, `a.*.b`, `user.*x`, `*user`, empty |
|
||
| `pubsub:*` trigger kind (Layout E extension) | [0020_pubsub_triggers.sql](crates/manager-core/migrations/0020_pubsub_triggers.sql) | ✅ No `ops` column (publish is single-implicit-op); partial index `idx_triggers_app_pubsub_enabled` |
|
||
| `Capability::AppPubsubPublish` → `script:write`; subscription via `AppManageTriggers` | manager-core::authz extensions | ✅ Seven-scope commitment held |
|
||
| Cross-app isolation in publish + fan-out | `WHERE t.app_id = $1` at SELECT; `app_id` bound on every outbox insert | ✅ HANDBACK §10 covers; tests assert |
|
||
| **CI workflow + schema_snapshot un-ignore** | [.github/workflows/ci.yml](.github/workflows/ci.yml) + schema_snapshot.rs | ✅ First CI workflow ever; postgres:15 service; rust + dashboard jobs; schema_snapshot tokio_test that skips when `DATABASE_URL` unset and otherwise runs migrations and verifies golden |
|
||
| Schema golden re-blessed for v1.1.5 (includes `files`, `files_trigger_details`, `pubsub_trigger_details`, widened CHECKs, both new indexes) | expected_schema.txt | ✅ Per HANDBACK §5 |
|
||
| Versions: workspace 1.1.4→1.1.5, SDK 1.5→1.6, dashboard 0.10.0→0.11.0 | Cargo.toml + version.rs + package.json | ✅ All bumped |
|
||
| Migrations sequential 0018→0020 | migrations/ | ✅ |
|
||
|
||
## 3. Substantive strengths
|
||
|
||
**1. The commit split.** v1.1.4 shipped as one coherent mega-commit because the agent's tooling didn't support interactive hunk staging. The v1.1.4 retro implicitly raised the question. The v1.1.5 agent split the work cleanly into `feat(v1.1.5): files SDK + files:* triggers` → `feat(v1.1.5): pubsub::publish_durable SDK + pubsub:* triggers` → `chore(v1.1.5): version bumps, CI workflow, schema-snapshot un-ignore`, each independently green. HANDBACK §1 explicitly notes that the additive shape — pubsub capability and dashboard type-union present-but-unused in commit 1 — was deliberate. This is the right shape for trunk-based review.
|
||
|
||
**2. The atomic write protocol is implemented exactly to spec.** Steps 2–6 live in `write_atomic_at` ([files_repo.rs:244-277](crates/manager-core/src/files_repo.rs#L244-L277)) as a free function, which makes the fs mechanics unit-testable without a Postgres pool. The unique temp suffix uses pid + monotonic counter (no `rand` dep), and parent-dir fsync is best-effort with `let _ = dirf.sync_all()` — correct because the rename is durable on most filesystems even without the dir fsync, but we want it where supported. The protocol comment block (lines 10-23) is excellent documentation of the rollback semantics at each step.
|
||
|
||
**3. The pub/sub fan-out is correctly transactional.** [pubsub_repo.rs:70-117](crates/manager-core/src/pubsub_repo.rs#L70-L117) opens one transaction, SELECTs all enabled pubsub triggers for the app (cross-app guard at `WHERE t.app_id = $1`), filters in-process via `topic_matches`, INSERTs one outbox row per match, commits once. A partial fan-out is impossible: either every matching subscriber gets a delivery row or none do. `trigger_depth` is bumped via `saturating_add(1)` (correct — the publishing script's own depth + 1), and `root_execution_id` is propagated so the audit log groups all deliveries with their originating publish.
|
||
|
||
**4. Topic pattern matching is clean and precise.** The `topic_matches` implementation ([shared/pubsub.rs:65-74](crates/shared/src/pubsub.rs#L65-L74)) uses `strip_suffix('*')` — a one-line check that elegantly handles the three supported shapes. Crucially, `"user.*"` strips to `"user."` (including the dot), so `topic_matches("user.*", "users.created")` correctly returns false. `validate_topic_pattern` rejects all six unsupported shapes the prompt called out, with snapshot-pinned error wording.
|
||
|
||
**5. Path traversal defense in depth.** `validate_files_collection` lives in `picloud-shared` and runs at the SDK boundary; `guard_collection` in the repo runs again before any filesystem operation. Both reject empty, `/`, `\`, `..`, NUL. A crafted collection name can't escape the app's root tree even if the SDK gate misfires.
|
||
|
||
**6. Discipline carryover.** Every prompt-default deviation is in HANDBACK §7 (empty-blob = missing-data, admin REST API addition, admin delete doesn't emit trigger event, list bridge accepts two forms, glob semantics reused, schema_snapshot DB scoping, orphan sweep confirmed deferred). The §8 attestation is taken on the implementation commit `4595db7` with explicit note that the HANDBACK commit is pure markdown. The v1.1.2/v1.1.3/v1.1.4 retro lessons stuck.
|
||
|
||
**7. CI workflow lands.** This is the first `.github/workflows/ci.yml` in the project — the v1.1.4 retro recommendation acted on without prompting. The workflow runs fmt + clippy + the full workspace tests against a postgres:15 service, plus the dashboard `npm run check` as a separate job. Schema golden silent drift across v1.1.1–v1.1.3 is now a regression the CI catches automatically.
|
||
|
||
**8. Schema-snapshot skip path is well-judged.** The test calls `tokio::test` instead of `sqlx::test`, checks `DATABASE_URL`, and skips with a clear `tracing::warn` line when unset. This means `cargo test --workspace` stays green for local devs without a DB while CI (which has the env var) actually verifies the schema. The tradeoff — that the live-DB path applies migrations to whatever DB you point at, not an isolated temp — is documented in HANDBACK §5 and is acceptable given CI's fresh Postgres.
|
||
|
||
## 4. Open questions answered
|
||
|
||
HANDBACK §9 raises three:
|
||
|
||
### 4.1 Orphan-sweep deferral
|
||
|
||
**Verdict: accept.** Confirmed during planning. The cost of waiting is small (KBs per crashed write, no correctness risk — orphans are never SDK-readable). Defer to v1.1.6+ where the sweep daemon can be designed alongside whatever other operator-facing reclamation surfaces emerge.
|
||
|
||
### 4.2 Test count 63 vs the 70-90 target
|
||
|
||
**Verdict: accept the undershoot.**
|
||
|
||
The agent's argument is sound: every named critical test in the prompt's §8 is present (atomic write rollback, checksum tampering, cross-app, path traversal, authz, fan-out transactional rollback, topic pattern shapes including all six rejections, multiple-matches, blob-to-base64). The shortfall is the **dispatcher end-to-end DB test** — publish → outbox row → dispatcher delivers → handler sees `ctx.event`.
|
||
|
||
But: that end-to-end path is *entirely* through code that v1.1.1/v1.1.2/v1.1.4 already exercise. The dispatcher's `Files | Pubsub` match-arm extension is a one-line change. The handler's `ctx.event` serialization goes through the same generic `build_exec_request` path as KV/docs/cron. Adding a v1.1.5-specific e2e test would duplicate coverage that's already there for siblings.
|
||
|
||
If we wanted dispatcher e2e tests, they should be a workspace-wide effort (one test per trigger kind, gated on `DATABASE_URL`, picking up the new CI workflow's Postgres). That's a meaningful follow-up — worth flagging for v1.1.6 — but not v1.1.5's problem.
|
||
|
||
### 4.3 Empty-blob = missing-data
|
||
|
||
**Verdict: accept the deviation; relaxable later.**
|
||
|
||
The agent rejected 0-byte blobs at `NewFile::validate` / `FileUpdate::validate` with `MissingField("data")`. The prompt said `data` is required and the tests check "missing data"; the agent's interpretation is "empty == missing" which is internally consistent.
|
||
|
||
The cost: v1.1.5 can't store an intentionally-empty file. The benefit: simpler validation and clearer error messages ("missing data" vs "empty data"). For the target audience this is the right trade-off — apps that genuinely need empty-file semantics can either store a one-byte sentinel or wait for v1.2 to relax it. Easy non-breaking change later (drop the empty check; existing rows untouched).
|
||
|
||
Flag for v1.1.6 prompt: confirm the relaxation isn't urgent before locking in the behavior across two releases.
|
||
|
||
## 5. Smaller observations (no action required)
|
||
|
||
- **Admin file-delete bypasses `files:delete` trigger emission.** HANDBACK §7 #3 flagged this. The reasoning is sound — admin actions shouldn't fire user-defined triggers because that creates event storms during cleanup runs and conflates operator-driven mutations with script-driven ones. SDK deletes still emit; only the admin REST endpoint skips. Reasonable.
|
||
- **Admin files REST API addition** ([files_api.rs](crates/manager-core/src/files_api.rs)) was needed to back the dashboard view. Mirrors `triggers_api`'s direct-repo + capability pattern. HANDBACK §7 #2 flagged it.
|
||
- **`files` `list` bridge accepts both positional and map forms** (HANDBACK §7 #4). Additive convenience; the map form matches the prompt's example. Fine.
|
||
- **Collection-glob dialect reuses the existing `collection_matches`** (`*` / `foo*` prefix / exact) instead of introducing a new `"prefix:*"` form. Right call — keeping parity with KV/docs trigger semantics. HANDBACK §7 #5 flagged it.
|
||
- **`shared::pubsub::NoopPubsubService`** is added for the executor-core integration test harness — every call returns `PubsubError::Unavailable`. Same pattern as the existing `NoopEventEmitter`. Clean.
|
||
- **The publish saturating-add for `trigger_depth`** ([pubsub_repo.rs:107](crates/manager-core/src/pubsub_repo.rs#L107)) means a publish from depth-`u32::MAX` won't panic. That's already capped by `PICLOUD_MAX_TRIGGER_DEPTH` (default 8) at the dispatcher, but defensive overflow handling is correct.
|
||
- **`shared/src/pubsub.rs` tests** include four named cases (exact, prefix wildcard, universal, validation) with subcases — clean test taxonomy.
|
||
|
||
## 6. Versioning audit
|
||
|
||
| File | Before | After | Status |
|
||
|---|---|---|---|
|
||
| Workspace `Cargo.toml` | 1.1.4 | 1.1.5 | ✅ |
|
||
| SDK schema (`shared/src/version.rs`) | 1.5 | 1.6 | ✅ correctly bumped — `FilesService`, `PubsubService`, `FileMeta`, `NewFile`, `FileUpdate`, `topic_matches`, `validate_topic_pattern`, `TriggerEvent::{Files, Pubsub}` added to public surface |
|
||
| Dashboard `package.json` | 0.10.0 | 0.11.0 | ✅ |
|
||
| Migrations | 0001..0017 | 0018..0020 added | ✅ sequential, no skips |
|
||
| CHANGELOG.md | v1.1.4 entry | v1.1.5 entry added | ✅ |
|
||
|
||
## 7. Recommended next steps (post-merge)
|
||
|
||
1. **Merge** `feat/v1.1.5-files-pubsub` into `main` (fast-forward; branch is linear ahead).
|
||
2. **Pause** before dispatching v1.1.6 (Realtime Channels & Client Library — the co-shipped SSE + `@picloud/client` work).
|
||
3. **For the v1.1.6 dispatch prompt**, consider folding in:
|
||
- **Dispatcher end-to-end DB tests** for each trigger kind. This is broader than v1.1.5 — it's a workspace-wide hygiene task. Now that CI has a Postgres service (per v1.1.5's `.github/workflows/ci.yml`), gating these tests on `DATABASE_URL` lets them run in CI without breaking local `cargo test`. Cost is bounded; the goal is to catch dispatcher regressions before they surface as production trigger silence.
|
||
- **Empty-blob storage** — revisit whether `data: 0 bytes` should be a valid stored state (currently rejected as missing). Decide before v1.1.6 ships so the semantics across two releases stay consistent.
|
||
- **Orphan file sweeper** — design + ship the simple `*.tmp.*` sweeper (defer the full DB-cross-check version to v1.3+). v1.1.6 is when the file storage will start to accumulate enough that operators notice.
|
||
4. **Awareness:** v1.1.5 is the first release where the CI workflow exists. If the project lands new contributors before v1.1.6, the workflow needs `secrets` review (none currently set) and possibly branch-protection rules pointing at the CI checks.
|
||
|
||
Branch is ready for merge. Verdict: **APPROVE**.
|