# v1.1.4 Handback — Outbound HTTP SDK & Cron Triggers **Branch:** `feat/v1.1.4-http-cron` (off `main`) **Commits:** 1 implementation commit (`feat(v1.1.4): outbound HTTP SDK + cron triggers`) + this HANDBACK commit. > **Note on commit granularity:** the brief suggested split > `feat(v1.1.4-http)` / `feat(v1.1.4-cron)` commits. The two features are > interleaved across shared files (`Cargo.toml`, `crates/picloud/src/lib.rs`, > `crates/manager-core/src/lib.rs`, `version.rs`, `services.rs`), so > cleanly-*compiling* per-theme commits aren't separable without interactive > hunk staging (unavailable in this environment). I chose one coherent, > green commit over shipping broken intermediates. Squash/relabel as you see fit. --- ## Scope coverage | # | Item | Status | |---|------|--------| | 1 | `http::*` SDK surface (get/post/put/patch/delete/head/post_form/request) | **Done** | | 2 | SSRF deny-list (resolved-IP, DNS-rebinding defense, scheme/port, body caps, UA, timeouts, `PICLOUD_HTTP_ALLOW_PRIVATE`) | **Done** | | 3 | http authz (`Capability::AppHttpRequest` → `script:write`, script-as-gate, no new Scope) | **Done** | | 4 | `HttpService` trait + `HttpServiceImpl` + Services wiring | **Done** | | 5 | Cron migration `0017` (Layout-E extension) | **Done** | | 6 | Cron scheduler tokio task (catch-up = fire-once) | **Done** | | 7 | `ctx.event.cron` shape + `TriggerEvent::Cron` | **Done** | | 8 | Dispatcher routing extension (`… \| Cron`) | **Done** | | 9 | Dashboard cron trigger UI (minimal) | **Done** | | 10a | Redact `ModuleSourceError::Backend` at resolver boundary | **Done** | | 10b | Pin `rhai = "=1.24"` | **Done** | | 10c | CHANGELOG retroactive v1.1.3 cross-app-trigger security note | **Done** | | 11 | Version bumps (workspace 1.1.4, SDK 1.5, dashboard 0.10.0) | **Done** | | 12 | Tests (~50-70) | **Done** — 70 new | --- ## SSRF policy implementation notes - **reqwest hook.** `crates/manager-core/src/ssrf.rs` defines `SsrfResolver` implementing `reqwest::dns::Resolve`, plugged in via `ClientBuilder::dns_resolver`. It delegates to the system resolver (injectable for tests — see DNS-rebinding test), then filters each `IpAddr` through `SsrfPolicy::check`. Because reqwest re-resolves at every connection (including each redirect hop), the policy applies post-redirect too. - **`dns_resolver` is generic over a concrete `R: Resolve`** (stores `Arc`), so the resolver is passed as `Arc`, not `Arc`. - **Literal-IP gap closed.** reqwest only routes *hostnames* through the custom resolver — a URL with a literal-IP host (`http://127.0.0.1/`) bypasses it entirely. The impl therefore *also* runs `SsrfPolicy::check` on literal-IP hosts at URL-parse time (`validate_url`), on every hop. Both paths are tested. - **IPv4-mapped IPv6 re-check.** `check_v6` calls `Ipv6Addr::to_ipv4_mapped()`; if `Some`, it re-runs the v4 deny-list against the embedded address (`::ffff:127.0.0.1` → denied as "loopback"). - **Applied before AND after redirects.** Redirects are followed *manually* (client built with `redirect(Policy::none())`) so per-request `follow_redirects`/`max_redirects` are honored; each hop re-validates scheme/port + literal-IP and re-resolves hostnames through the SSRF resolver. - **Script-visible error format.** `"http: blocked by SSRF policy: "` where `` is a CIDR category (`loopback`, `private`, `link-local`, `carrier-grade-nat`, `multicast`, `reserved`, `unique-local`, `unspecified`). **The resolved IP is never included.** The all-addresses-denied case surfaces as `Ssrf` (not a generic DNS error) via a marker error the resolver emits and the impl detects by walking the reqwest error source chain. ## Cron scheduler implementation notes - **Catch-up = fire-once.** Matches the brief; no deviation. `next_due` returns a single canonical scheduled-at (first slot after `last_fired_at`, or `created_at` if never fired); after firing, `last_fired_at = now`, so the next tick sees only future slots. **Verified live** against Postgres: an every-second (`* * * * * *`) trigger with a 2s tick advanced `last_fired_at` ~once per 2s, not once per second. - **No ExecutionGate contention.** The scheduler only enqueues to the outbox (one row per due trigger per tick, in a `FOR UPDATE OF d SKIP LOCKED` transaction that also bumps `last_fired_at`). The existing dispatcher acquires the gate and delivers it identically to kv/docs/dead_letter — verified live (the cron outbox row was consumed, the script executed, the row deleted). - **Timezone handling.** `chrono-tz`. Invalid IANA names are rejected at the admin endpoint with a 422 (`TriggersApiError::Invalid`, message contains "timezone"); the repo re-validates defensively before insert. - **Schema beyond the brief:** none. Followed the brief exactly — `schedule`, `timezone DEFAULT 'UTC'`, `last_fired_at`, `idx_cron_triggers_due`. **No** stored `next_scheduled_at` column (an exploration agent suggested one; the brief computes next-fire in-process, which I followed). --- ## Tests added (70 new) - **SSRF policy + resolver (`ssrf.rs`, 20):** one per deny CIDR (127/8, 0/8, 10/8, 172.16/12, 192.168/16, 169.254/16 incl. metadata, 100.64/10, 224/4, 240/4, ::1, ::, fe80::/10, fc00::/7, ff00::/8); 172.x outside-range allowed; public v4/v6 allowed; IPv4-mapped re-check; `allow_private` disables all; resolver returns only allowed addrs; all-denied → SSRF marker; **DNS rebinding** (mock resolver: public then private — second denied); empty resolution ≠ SSRF. - **HTTP client (`http_service.rs`, 16):** GET/POST round-trips vs a hand-rolled `TcpListener`; body dispatch + default UA; custom UA override; empty body; non-2xx no-error; response cap via Content-Length; response cap mid-stream (no Content-Length); request body cap pre-send; redirect-to-max-then-throw; scheme rejection (file/ftp/gopher); port rejection (22/25/465/587); SSRF literal-loopback; SSRF hostname-resolves-to-loopback; timeout; authz (anon skips / member forbidden / member-with-role allowed). - **Bridge integration (`sdk_http.rs`, 15):** real Rhai engine under `spawn_blocking` vs a recording fake — status+JSON body, non-JSON string, empty→`()`, Map→JSON, String→text, `()`→no body, headers+timeout forwarded, unknown opt key throws, timeout>max throws, non-2xx no-throw, network error throws `http:`, `post_form` url-encoding, `request` arbitrary method, default-UA carries `script_id`, `cx.app_id` forwarded for attribution. - **Cron scheduler (`cron_scheduler.rs`, 11):** 6-field schedule accept / 5-field + malformed reject; IANA tz accept / reject; due/not-due; never-fired uses created_at; **catch-up fires exactly once after 5 missed windows**; timezone affects fire time; bad schedule/tz → None. - **Cron admin (`triggers_api.rs`, 6):** create succeeds; invalid schedule; unknown timezone; **module target rejected** (v1.1.3 regression); **cross-app target rejected** (v1.1.3 regression); member-without-role forbidden. - **Module redaction (2):** `modules.rs` — backend error redacted from the script-visible error (no leak); `module_redaction_logging.rs` — original error **is** logged at ERROR level (captured via a global tracing subscriber). --- ## Decisions beyond the brief (every prompt-default deviation, flagged) 1. **Three-arg split `verb(url, body, opts)`** (user-approved during planning). Diverges from the brief's documented two-arg `(url, opts)` shape and generalizes the escape hatch to `request(method, url, body, opts)`. Resolves the brief's internal contradiction (its Slack example `http::post(url, #{text:...})` passed a bare body map, which would be an "unknown opt key" under the two-arg rule). The `opts` vocabulary is now exactly `{headers, timeout_ms, follow_redirects, max_redirects}` — **`body_raw` was dropped** (raw strings go through the positional body as a String). The Slack example works unchanged (`#{text:...}` is the body). 2. **Cron crate = `cron` (0.12), not `croner`.** The brief allowed either; `cron` handles the 6-field-with-seconds format and named weekdays (`MON-FRI`) used in the brief's example, and integrates with chrono `Schedule::after`. 3. **Catch-up = fire-once** — matches the brief; called out explicitly as requested. No deviation. 4. **`SdkCallCx` gained a `script_id` field.** The brief's default User-Agent is `picloud/ (script:)`, but `SdkCallCx` didn't carry the script id. Adding it (sourced from `ExecRequest.script_id` in the engine) is the clean home and doubles as the audit-attribution key the brief emphasizes. All 19 construction sites updated. The dead-letters admin cx uses a fresh sentinel id (no script executes there). 5. **SSRF also blocks IPv6 unspecified `::` and IPv4 `0.0.0.0`** with reason "unspecified". `0.0.0.0/8` is in the brief's list; `::` is not explicitly but is an obvious sibling hole, so I blocked it too (defensible superset). 6. **No reqwest feature additions needed** — `dns_resolver` and `Response::chunk()` compile under the existing `default-features = false, features = ["json","rustls-tls"]`. No cookie jar (cookies feature is off, so there's no jar to disable). Added `url` as an executor-core dep (for `form_urlencoded` in `post_form`). --- ## How to verify locally (§8 attestation — run on this exact HEAD) All four gates were run on the handback HEAD (the `feat(v1.1.4)` commit, before this markdown commit): ``` cargo fmt --all -- --check → exit 0 cargo clippy --all-targets --all-features -- -D warnings → exit 0 cargo test --workspace → 427 passed, 0 failed (cd dashboard && npm run check) → 0 errors, 0 warnings (369 files) ``` This HANDBACK commit is pure markdown (no gate-relevant files), so the numbers above hold for the final HEAD. **Migrations — verified against a real Postgres (dev stack, port 15432):** - Fresh-DB replay: the `#[sqlx::test]` schema-snapshot test applies all migrations on a fresh ephemeral DB and matches the (re-blessed) golden — passes. - On-top-of-prior-state: booting `picloud` against a dev DB pinned at migration `0006` applied `0007…0017` cleanly (`"migrations applied"`); `_sqlx_migrations` max is now `17`; `cron_trigger_details` + widened CHECKs present. **Live smoke performed:** - Boot logged the `PICLOUD_HTTP_ALLOW_PRIVATE` warning and started the cron scheduler + HTTP service without panic. - Seeded an every-second cron trigger → scheduler set `last_fired_at`, dispatcher consumed the outbox row and ran the script (row deleted on success), and `last_fired_at` advanced at the tick cadence (fire-once confirmed). Smoke data cleaned up afterward. - HTTP GET / SSRF-block / body-dispatch behaviors are covered by the automated integration tests (real `TcpListener` round-trips + loopback/hostname SSRF blocks) rather than a manual curl flow, since a live SSRF-block smoke conflicts with the `PICLOUD_HTTP_ALLOW_PRIVATE` a local-server smoke requires. To re-run the schema snapshot: ``` docker compose up -d postgres DATABASE_URL=postgres://picloud:picloud@127.0.0.1:15432/picloud \ cargo test -p picloud-manager-core --test schema_snapshot -- --include-ignored ``` --- ## ⚠️ Latent issue found: stale schema-snapshot golden `crates/manager-core/tests/expected_schema.txt` was **significantly stale** — the committed golden was missing many tables from prior releases (`abandoned_executions`, `dead_letters`, `dead_letter_trigger_details`, `docs_*`, etc.). The `schema_snapshot` test is `#[ignore]` (needs a DB), so it was apparently never re-blessed across v1.1.1–v1.1.3 and silently drifted. I re-blessed it, so the diff is large (+217 lines) but **only `cron_trigger_details` + the two widened CHECK constraints are v1.1.4-new** — the rest is pre-existing drift correction. The blessed golden now matches a clean replay (verified). Recommend the reviewer skim the diff to confirm, and consider whether the `#[ignore]` should be lifted in CI (with a DB service) so the golden can't drift again. --- ## Latent security findings None new beyond the (already-known, already-closed-in-v1.1.3) cross-app trigger gap, which §10c now documents in the CHANGELOG. The SSRF surface is the main security mechanism in this release; see the SSRF notes above for the defense-in-depth layering (resolver hook + literal-IP check + per-hop re-validation + IP-never-leaked errors). One thing for the reviewer to weigh: the SSRF policy is a hardcoded deny-list with no per-app allow-list (deferred to v1.2 per the brief). An operator who needs a script to reach a private service has only the all-or-nothing `PICLOUD_HTTP_ALLOW_PRIVATE` global escape hatch today. ## Open questions for the reviewer 1. **Three-arg HTTP shape** (decision #1) — confirm you're happy with `verb(url, body, opts)` + dropping `body_raw`, vs the brief's documented two-arg form. This is the one user-facing API-shape divergence. 2. **Stale schema golden** — OK to land the full re-bless in this PR, or would you prefer the drift correction split out? ## Deferred items (per the brief's OUT list — not built) WebSocket/SSE, streaming responses, HTTP/3, per-app outbound allow/deny lists, per-app rate limits, mTLS, request signing, cookie jar, cron backfill replay, cron next-fire preview, cron schedule history, drift compensation, module-import-over-HTTP, files/pubsub/secrets/email/users/queue. ## Known limitations / rough edges - The dashboard Triggers tab lists all trigger kinds but only *creates* cron triggers (kv/docs creation remains API-only, unchanged from before). No next-fire-at preview (deferred to v1.2). - `post_form` / body field order follows Rhai map iteration order (`BTreeMap`-backed, so sorted/deterministic; not insertion order). - The cron scheduler tick is floored at 1s; sub-second schedules effectively fire at the tick cadence (by design — see the fire-once policy). - The stale REVIEW.md at repo root is the v1.1.3 reviewer's artifact; the v1.1.4 reviewer should overwrite it.