# HANDBACK — v1.1.5 Files & Pub/Sub ## §1 Branch + commits - **Branch:** `feat/v1.1.5-files-pubsub` (off `main`). Not pushed, not merged, no PR. - **Commits:** the two-feature split decided in planning + a finalize commit; HANDBACK is the 4th (docs): 1. `6e132b6 feat(v1.1.5): files SDK + files:* triggers` 2. `834c787 feat(v1.1.5): pubsub::publish_durable SDK + pubsub:* triggers` 3. `4595db7 chore(v1.1.5): version bumps, CI workflow, schema-snapshot un-ignore` 4. `docs(v1.1.5): handback report` (this file) Each of commits 1–3 is independently green (fmt + clippy + `cargo test --workspace`). Shared files (Cargo deps, `Services` bundle, `version.rs`, dispatcher arm, authz enum, CHANGELOG) are touched in both feature commits as planned — additive only, so commit 1 compiles green with the `AppPubsubPublish` capability and the dashboard `'pubsub'` type union present-but-unused until commit 2. ## §2 Scope coverage | Brief item | Status | Notes | |---|---|---| | §1 `files::*` SDK | ✅ | `create/head/get/update/delete/list`, blob in/out, metadata maps, throw-vs-`()` convention. | | §1 migration 0018_files.sql | ✅ | metadata table + `idx_files_app_collection`. Bytes on disk, never in PG. | | §1 atomic writes/deletes, checksum, size+name+type caps, authz, events | ✅ | See §3. | | §2 `files:*` trigger (Layout-E, 0019) | ✅ | widen 2 CHECKs + `files_trigger_details`; `TriggerEvent::Files` (metadata only); admin `POST /triggers/files`; `emit_files`; dispatcher arm. | | §3 `pubsub::publish_durable` SDK | ✅ | publish-time transactional fan-out; topic matching in Rust; succeed-silently on no match. | | §4 `pubsub:*` trigger (Layout-E, 0020) | ✅ | widen 2 CHECKs + `pubsub_trigger_details` + partial index; `TriggerEvent::Pubsub`; admin `POST /triggers/pubsub`; dispatcher arm. | | §5 dashboard Files view | ✅ | `apps/[slug]/files/+page.svelte` (list per collection, per-row delete w/ confirm). Backed by a new admin files API (§7.2). | | §5 dashboard Pub/Sub trigger form | ✅ | added to the Triggers tab beside Cron; trigger-list renders files + pubsub. `npm run check` clean. | | §6 schema_snapshot CI follow-up | ✅ | §6b skip-when-absent + un-ignore; §6a new `.github/workflows/ci.yml`. See §5. | | §7 version bumps | ✅ | workspace 1.1.4→1.1.5, SDK 1.5→1.6, dashboard 0.10.0→0.11.0, CHANGELOG, CLAUDE.md env table. | | §8 tests | ⚠️ | 63 new tests (target 70–90). Every *named* critical test covered; gap is the dispatcher end-to-end DB test (see §9.2). | ## §3 Files implementation notes **Service layering** (`FilesServiceImpl`, manager-core): validate collection (empty + traversal) → script-as-gate authz (`AppFilesRead`/`AppFilesWrite`, skipped when `cx.principal` is `None`) → field/size-cap validation → repo call keyed by `cx.app_id` → best-effort `ServiceEvent` emit. `executor-core` has **no** Postgres or filesystem dependency — both traits live in `picloud-shared`, the impl in manager-core. **Atomic-write protocol** (`write_atomic_at`, a free fn so it's unit-testable without a pool): 1. Validate collection path-safety (defensive — already enforced at the SDK boundary). 2. `create_dir_all` the shard dir `/files////` with `0o700` (Unix `DirBuilderExt::mode`). 3. SHA-256 the in-memory bytes (single pass — never re-reads the file) while writing to `.tmp.-`. 4. `sync_all()` the temp file. 5. `rename(tmp, final)` — atomic on POSIX. 6. `sync_all()` the parent dir (rename durability). 7. INSERT/UPDATE the DB row. Rollback per step: crash in 1–5 → orphan `*.tmp.*` (never read; the pid+counter suffix avoids collisions); crash in 5–7 → bytes with no row, **never reachable via the SDK** because every read starts from the row. `update` reads the prior row first (existence + CDC `prev`), writes new bytes, then UPDATEs. **Atomic-delete protocol** (`FsFilesRepo::delete`): `SELECT … FOR UPDATE` + `DELETE` in one transaction → commit → `unlink` outside the tx. Unlink failure leaves an orphan (logged at warn); failure before commit changes nothing. Returns the deleted metadata so the service can emit. **Path-traversal validation:** `picloud_shared::validate_files_collection` rejects empty / `/` / `\` / `..` / NUL at the SDK boundary; `FsFilesRepo::guard_collection` repeats it before any fs op. UUID ids can't produce traversal (verified defensively). **Per-call SHA-256:** computed once over the in-memory `Vec` during the write (`sha2::Sha256`), hex-lowercased, stored on the row. The file is never re-read to hash. Known-vector tests pin `SHA-256("abc")` and `SHA-256("")`. **Checksum-on-get:** `get` reads the file, re-hashes, compares to the stored checksum. Mismatch (or missing bytes while the row persists) → `FilesError::Corrupted`, logged at error level with the path, **no auto-delete**. To scripts this surfaces as a thrown Rhai error `"files: file content corrupted (checksum mismatch)"`. ## §4 Pub/Sub implementation notes **Fan-out-at-publish-time, transactional** (`PostgresPubsubRepo::fan_out_publish`): one transaction — `SELECT` all enabled pubsub triggers for the app (joined to `pubsub_trigger_details`), filter by `topic_matches` in Rust, INSERT one `outbox` row (`source_kind='pubsub'`) per survivor, commit once. A mid-fan-out failure rolls back every row (no half-fan-out). Each delivery row then retries/dead-letters independently through the unchanged dispatcher (its trigger arm just gained `| OutboxSourceKind::Pubsub`). **Topic pattern matching** runs in Rust (`picloud_shared::topic_matches`), not SQL: `"*"` → all; `".*"` → `starts_with(".")`; otherwise exact. `validate_topic_pattern` (used at trigger creation in the admin endpoint and defensively in the repo) accepts only `*` / `.*` / no-star-exact, rejecting `*.created`, `**`, `a.*.b`, `user.*x`, etc. with `"unsupported pubsub topic pattern: …"`. **No matching trigger → the publish succeeds, zero outbox rows** (the design-notes-preferred succeed-silently). `published_at` is stamped manager-side (`Utc::now()`) so every delivery agrees on one instant. `ctx.event.pubsub = #{ topic, message, published_at }`, `ctx.event.op = "publish"`. There is **no `list_matching_pubsub` on `TriggerRepo`** — pubsub publishes directly (it's not a `ServiceEvent`), so the fan-out SELECT lives in `pubsub_repo`, not the `OutboxEventEmitter`. This is the one structural asymmetry vs files/kv/docs, intentional per the publish-time-fan-out decision. ## §5 CI follow-up (§6) status - **Pre-existing CI:** none (no `.github/`, no `.gitlab-ci.yml`). - **§6a (added):** `.github/workflows/ci.yml` — a `rust` job with a `postgres:15` service (`DATABASE_URL=postgres://picloud:picloud@localhost:5432/picloud`) running `cargo fmt --all -- --check`, `cargo clippy --all-targets --all-features -- -D warnings`, `cargo test --workspace`; a separate `dashboard` job running `npm ci` + `npm run check`. - **§6b (done):** `schema_snapshot.rs` is no longer `#[ignore]`'d. Reworked from `#[sqlx::test]` to `#[tokio::test]` that **skips cleanly when `DATABASE_URL` is unset** (chosen over fail-loud so `cargo test --workspace` stays green locally) and otherwise connects, runs `sqlx::migrate!`, and dumps. Golden `expected_schema.txt` re-blessed (now contains `files`, `files_trigger_details`, `pubsub_trigger_details`, both widened CHECKs, `idx_files_app_collection`, `idx_triggers_app_pubsub_enabled`, and migrations 0018–0020). - **Tradeoff (documented):** the non-`sqlx::test` path applies migrations against the `DATABASE_URL` database directly rather than an isolated throwaway DB. Migrations are forward-only/idempotent and CI's Postgres is fresh, so the structural dump is identical; locally it will also apply 0018–0020 to whatever DB you point at. ## §6 Schema decisions beyond the brief - `files` table is verbatim from the brief. `files_trigger_details` / `pubsub_trigger_details` mirror `kv_trigger_details` / `cron_trigger_details`. - `pubsub_trigger_details` has no `ops` column (a publish has a single implicit op) — only `topic_pattern`. - `idx_triggers_app_pubsub_enabled` is the third partial index of its kind (per the brief's note); deliberate duplication. ## §7 Decisions beyond the brief (every prompt-default deviation) 1. **Empty blob treated as a missing `data` field.** `NewFile::validate` / `FileUpdate::validate` reject 0-byte `data` with `FilesError::MissingField("data")`. The brief lists `data` as required and tests "missing … data"; the cleanest testable interpretation at the service layer is "empty == missing". Consequence: v1.1.5 cannot store an intentionally-empty file. Easy to relax later. 2. **Admin files REST API added** (`files_api.rs`: `GET /apps/{id}/files?collection=…`, `DELETE /apps/{id}/files/{collection}/{file_id}`). The brief's §5 dashboard needs a backend but didn't spell out admin endpoints; I added a minimal one mirroring `triggers_api`'s direct-repo + capability pattern (`AppFilesRead` for list, `AppFilesWrite` for delete). 3. **Admin file delete does NOT emit a `files:delete` trigger event.** It's an operator cleanup action, not a script mutation, so it goes straight to the repo. SDK deletes still emit. Flagging because "every successful mutation emits" could be read to include admin deletes. 4. **Files `list` bridge accepts both positional and map forms** — `list()`, `list(cursor)`, `list(cursor, limit)`, and `list(#{ cursor, limit })` (the map form the brief's example used). Additive convenience. 5. **Files collection-glob semantics reuse the existing `collection_matches`** (`*` / `foo*` prefix / exact), identical to kv/docs. The brief mentioned a `"prefix:*"` form in one spot; I kept parity with the established kv/docs matcher rather than introduce a new glob dialect. 6. **schema_snapshot runs against the live `DATABASE_URL` DB** rather than an isolated temp DB (see §5). 7. **Orphan sweep deferred to v1.1.6+** — confirmed with the user during planning (the brief's recommended default). No `*.tmp.*` sweeper daemon shipped. ## §8 How to verify locally — attestation (fresh run on HEAD `4595db7`) ``` cargo fmt --all -- --check → exit 0 cargo clippy --all-targets --all-features -- -D warnings → exit 0 cargo test --workspace → 491 passed, 0 failed (exit 0) (schema_snapshot skips cleanly with no DATABASE_URL) cd dashboard && npm run check → 0 errors, 0 warnings (exit 0) ``` With a live Postgres (the schema guardrail actually verifies the schema): ``` DATABASE_URL=postgres://picloud:picloud@127.0.0.1:15432/picloud \ cargo test -p picloud-manager-core --test schema_snapshot → test result: ok. 1 passed ``` Migrations 0018–0020 applied cleanly on top of the existing v1.1.4 dev DB during the re-bless — the same `sqlx::migrate!` replay CI runs on a fresh Postgres. Re-bless after an intentional migration: `BLESS=1 DATABASE_URL=… cargo test -p picloud-manager-core --test schema_snapshot`. **Not run this session:** the full running-binary manual smoke (a script that does `files::collection("uploads").create(...)` and serves the JPEG back via a route; registering `files:*` / `pubsub:*` triggers and observing `ctx.event`). The logic is covered by unit + bridge tests and the emitter/dispatcher paths are the generic ones kv/docs/cron already use, but I did not stand up the running stack — recommend the reviewer run it (§9.2). ## §9 Open questions for the reviewer 1. **Orphan sweep** — deferred to v1.1.6+ per the planning decision. Confirm shipping v1.1.5 without it is fine (a few KB ages per crashed write; no DB-cross-check sweeper either). 2. **Test count 63 vs the 70–90 target.** Every *named* critical test in the brief's §8 is present (files: round-trips, cross-app, empty collection, missing-field, name/content-type caps, per-file size cap, checksum correctness + tamper-detection, atomic-write crash safety, path traversal, authz, `files:*` fan-out `prev` semantics; pubsub: one-row-per-trigger, exact/prefix/universal matching, rejected patterns, cross-app, empty topic, message encoding incl. blob→base64, transactional rollback, multiple matches). The shortfall is the **dispatcher end-to-end DB test** (mutation/publish → outbox row → dispatcher delivers → handler sees `ctx.event`). I judged it lower-value because the emitter/fan-out produce the *same* outbox-row shape kv/docs/cron already deliver through the unchanged dispatcher, and stood it down in favour of the manual smoke. Want a `DATABASE_URL`-gated integration test added for it? 3. **Empty-blob = missing-data** (§7.1) — acceptable, or should empty files be storable? ## §10 Latent security findings None new. Checked specifically: (a) cross-app isolation is keyed on `cx.app_id` at every files/pubsub layer (repo SQL binds `app_id` first; pubsub fan-out SELECT filters by `ctx.app_id`); tests assert app A can't see/fire app B's files/triggers. (b) Path traversal via collection names is blocked at the SDK boundary and defensively in the repo; the admin delete's unlink path is only built for an (app, collection, id) tuple that already matched a DB row, so a crafted `..` segment can't unlink arbitrary files. (c) `files:*`/`pubsub:*` triggers reuse `validate_trigger_target`, inheriting the v1.1.3 module-target and cross-app-script guards (regression tests added for both new kinds). ## §11 Deferred items (per brief Scope-OUT + orphan-sweep decision) `publish_ephemeral` (v1.2), per-app storage quotas (v1.2), file dedup (v1.2+), presigned URLs / external download tokens (v1.1.6+), streaming up/download (Rhai is sync), file-level ACLs (v1.2+), mid-pattern wildcards (v1.2), topic ACLs / external subscription / `topics` table (v1.1.6), realtime SSE (v1.1.6), and the **orphan-file sweep daemon** (v1.1.6+ — confirmed deferred). ## §12 Known limitations / rough edges - **No orphan reclamation** — crashed writes leave `*.tmp.*`; rename-completed-but-DB-failed leaves unreferenced bytes. Both are harmless (never SDK-readable) but accumulate until v1.1.6's sweeper. - **Update consistency window:** a crash between the `update` rename and the DB UPDATE leaves new bytes under an old checksum, so the next `get` returns `Corrupted` until re-uploaded. This is the brief's accepted step-5–7 window, surfaced honestly. - **Pub/sub fan-out holds one transaction across all subscribers** — fine at v1.1.x scale; a topic-trie index is the v1.2 escape hatch if it becomes a hot path. - **Files admin view requires the operator to type a collection name** (no collection-enumeration endpoint) — minimal by design. - **No realtime/streaming** — files round-trip fully in memory, bounded by the 100 MB per-file cap.