v1.1.4 Handback — Outbound HTTP SDK & Cron Triggers
Branch: feat/v1.1.4-http-cron (off main)
Commits: 1 implementation commit (feat(v1.1.4): outbound HTTP SDK + cron triggers) + this HANDBACK commit.
Note on commit granularity: the brief suggested split `feat(v1.1.4-http)` / `feat(v1.1.4-cron)` commits. The two features are interleaved across shared files (`Cargo.toml`, `crates/picloud/src/lib.rs`, `crates/manager-core/src/lib.rs`, `version.rs`, `services.rs`), so cleanly compiling per-theme commits can't be separated without interactive hunk staging (unavailable in this environment). I chose one coherent, green commit over shipping broken intermediates. Squash/relabel as you see fit.
Scope coverage
| # | Item | Status |
|---|---|---|
| 1 | `http::*` SDK surface (get/post/put/patch/delete/head/post_form/request) | Done |
| 2 | SSRF deny-list (resolved-IP, DNS-rebinding defense, scheme/port, body caps, UA, timeouts, `PICLOUD_HTTP_ALLOW_PRIVATE`) | Done |
| 3 | http authz (`Capability::AppHttpRequest` → `script:write`, script-as-gate, no new `Scope`) | Done |
| 4 | `HttpService` trait + `HttpServiceImpl` + `Services` wiring | Done |
| 5 | Cron migration 0017 (Layout-E extension) | Done |
| 6 | Cron scheduler tokio task (catch-up = fire-once) | Done |
| 7 | `ctx.event.cron` shape + `TriggerEvent::Cron` | Done |
| 8 | Dispatcher routing extension (… \| Cron) | Done |
| 9 | Dashboard cron trigger UI (minimal) | Done |
| 10a | Redact `ModuleSourceError::Backend` at resolver boundary | Done |
| 10b | Pin `rhai = "=1.24"` | Done |
| 10c | CHANGELOG retroactive v1.1.3 cross-app-trigger security note | Done |
| 11 | Version bumps (workspace 1.1.4, SDK 1.5, dashboard 0.10.0) | Done |
| 12 | Tests (~50-70) | Done: 70 new |
SSRF policy implementation notes
- reqwest hook. `crates/manager-core/src/ssrf.rs` defines `SsrfResolver` implementing `reqwest::dns::Resolve`, plugged in via `ClientBuilder::dns_resolver`. It delegates to the system resolver (injectable for tests; see the DNS-rebinding test), then filters each `IpAddr` through `SsrfPolicy::check`. Because reqwest re-resolves at every connection (including each redirect hop), the policy applies post-redirect too. `dns_resolver` is generic over a concrete `R: Resolve` (stores `Arc<R>`), so the resolver is passed as `Arc<SsrfResolver>`, not `Arc<dyn Resolve>`.
- Literal-IP gap closed. reqwest only routes hostnames through the custom resolver; a URL with a literal-IP host (`http://127.0.0.1/`) bypasses it entirely. The impl therefore also runs `SsrfPolicy::check` on literal-IP hosts at URL-parse time (`validate_url`), on every hop. Both paths are tested.
- IPv4-mapped IPv6 re-check. `check_v6` calls `Ipv6Addr::to_ipv4_mapped()`; if `Some`, it re-runs the v4 deny-list against the embedded address (`::ffff:127.0.0.1` → denied as "loopback").
- Applied before AND after redirects. Redirects are followed manually (client built with `redirect(Policy::none())`) so per-request `follow_redirects` / `max_redirects` are honored; each hop re-validates scheme/port + literal-IP and re-resolves hostnames through the SSRF resolver.
- Script-visible error format. `"http: blocked by SSRF policy: <reason>"` where `<reason>` is a CIDR category (`loopback`, `private`, `link-local`, `carrier-grade-nat`, `multicast`, `reserved`, `unique-local`, `unspecified`). The resolved IP is never included. The all-addresses-denied case surfaces as `Ssrf` (not a generic DNS error) via a marker error the resolver emits and the impl detects by walking the reqwest error source chain.
Cron scheduler implementation notes
- Catch-up = fire-once. Matches the brief; no deviation. `next_due` returns a single canonical scheduled-at (first slot after `last_fired_at`, or `created_at` if never fired); after firing, `last_fired_at = now`, so the next tick sees only future slots. Verified live against Postgres: an every-second (`* * * * * *`) trigger with a 2s tick advanced `last_fired_at` roughly once per 2s, not once per second.
- No ExecutionGate contention. The scheduler only enqueues to the outbox (one row per due trigger per tick, in a `FOR UPDATE OF d SKIP LOCKED` transaction that also bumps `last_fired_at`). The existing dispatcher acquires the gate and delivers it identically to kv/docs/dead_letter; verified live (the cron outbox row was consumed, the script executed, the row deleted).
- Timezone handling. `chrono-tz`. Invalid IANA names are rejected at the admin endpoint with a 422 (`TriggersApiError::Invalid`, message contains "timezone"); the repo re-validates defensively before insert.
- Schema beyond the brief: none. Followed the brief exactly: `schedule`, `timezone DEFAULT 'UTC'`, `last_fired_at`, `idx_cron_triggers_due`. No stored `next_scheduled_at` column (an exploration agent suggested one; the brief computes next-fire in-process, which I followed).
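
The fire-once catch-up rule above can be sketched in a few lines. This is a simplified model, not the shipped scheduler: `FixedInterval` is a hypothetical stand-in for a parsed cron expression, timestamps are plain Unix seconds rather than chrono values, and `Trigger` / `tick` are illustrative names. Only `next_due`, `last_fired_at`, and `created_at` come from the notes above.

```rust
/// Hypothetical stand-in for a parsed cron schedule: one slot every
/// `every_secs` seconds on the Unix-epoch grid. Assumes `every_secs > 0`.
pub struct FixedInterval {
    pub every_secs: u64,
}

impl FixedInterval {
    /// First slot strictly after `t`.
    fn first_slot_after(&self, t: u64) -> u64 {
        (t / self.every_secs + 1) * self.every_secs
    }
}

/// Illustrative trigger row: only the fields the fire-once rule needs.
pub struct Trigger {
    pub schedule: FixedInterval,
    pub created_at: u64,
    pub last_fired_at: Option<u64>,
}

/// Returns the single canonical scheduled-at if the trigger is due at `now`.
/// The anchor is `last_fired_at`, or `created_at` if the trigger never fired.
pub fn next_due(t: &Trigger, now: u64) -> Option<u64> {
    let anchor = t.last_fired_at.unwrap_or(t.created_at);
    let slot = t.schedule.first_slot_after(anchor);
    (slot <= now).then_some(slot)
}

/// One scheduler tick: fire at most once, then move the anchor to `now`
/// so every slot missed before `now` is skipped rather than replayed.
pub fn tick(t: &mut Trigger, now: u64) -> Option<u64> {
    let due = next_due(t, now)?;
    t.last_fired_at = Some(now);
    Some(due)
}
```

Under this model, a 60-second schedule created at t=0 and first ticked at t=330 (five missed windows) fires once with scheduled-at 60, and the next fire happens at t=360.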
Tests added (70 new)
- SSRF policy + resolver (`ssrf.rs`, 20): one per deny CIDR (127/8, 0/8, 10/8, 172.16/12, 192.168/16, 169.254/16 incl. metadata, 100.64/10, 224/4, 240/4, ::1, ::, fe80::/10, fc00::/7, ff00::/8); 172.x outside-range allowed; public v4/v6 allowed; IPv4-mapped re-check; `allow_private` disables all; resolver returns only allowed addrs; all-denied → SSRF marker; DNS rebinding (mock resolver: public then private, second denied); empty resolution ≠ SSRF.
- HTTP client (`http_service.rs`, 16): GET/POST round-trips vs a hand-rolled `TcpListener`; body dispatch + default UA; custom UA override; empty body; non-2xx no-error; response cap via Content-Length; response cap mid-stream (no Content-Length); request body cap pre-send; redirect-to-max-then-throw; scheme rejection (file/ftp/gopher); port rejection (22/25/465/587); SSRF literal-loopback; SSRF hostname-resolves-to-loopback; timeout; authz (anon skips / member forbidden / member-with-role allowed).
- Bridge integration (`sdk_http.rs`, 15): real Rhai engine under `spawn_blocking` vs a recording fake: status+JSON body, non-JSON string, empty→`()`, Map→JSON, String→text, `()`→no body, headers+timeout forwarded, unknown opt key throws, timeout>max throws, non-2xx no-throw, network error throws `http:`, `post_form` url-encoding, `request` arbitrary method, default-UA carries `script_id`, `cx.app_id` forwarded for attribution.
- Cron scheduler (`cron_scheduler.rs`, 11): 6-field schedule accept / 5-field and malformed reject; IANA tz accept / reject; due/not-due; never-fired uses `created_at`; catch-up fires exactly once after 5 missed windows; timezone affects fire time; bad schedule/tz → `None`.
- Cron admin (`triggers_api.rs`, 6): create succeeds; invalid schedule; unknown timezone; module target rejected (v1.1.3 regression); cross-app target rejected (v1.1.3 regression); member-without-role forbidden.
- Module redaction (2): `modules.rs`: backend error redacted from the script-visible error (no leak); `module_redaction_logging.rs`: original error is logged at ERROR level (captured via a global tracing subscriber).
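
The three resolver outcomes listed in the SSRF tests (rebinding answer denied, all-denied → SSRF marker, empty resolution ≠ SSRF) can be modeled as a small sketch. Everything here is hypothetical scaffolding: `ScriptedResolver`, `ResolveError`, and `resolve_filtered` are illustrative names, and `denied` covers only loopback and RFC 1918 ranges as a stand-in for the full policy. The real tests go through `reqwest::dns::Resolve`.

```rust
use std::collections::VecDeque;
use std::net::IpAddr;

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// Every resolved address was denied by policy (the "SSRF marker" case).
    Ssrf,
    /// The upstream resolver returned nothing: an ordinary DNS failure.
    NoAddresses,
}

/// Hypothetical mock resolver: hands out one scripted answer per lookup,
/// which is how a DNS-rebinding sequence (public, then private) is simulated.
pub struct ScriptedResolver {
    answers: VecDeque<Vec<IpAddr>>,
}

impl ScriptedResolver {
    pub fn new(answers: Vec<Vec<IpAddr>>) -> Self {
        Self { answers: answers.into() }
    }

    fn lookup(&mut self) -> Vec<IpAddr> {
        self.answers.pop_front().unwrap_or_default()
    }
}

/// Simplified deny check (loopback + RFC 1918 only), standing in for the policy.
fn denied(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_loopback() || v4.is_private(),
        IpAddr::V6(v6) => v6.is_loopback(),
    }
}

/// Resolves on every call and filters each answer, so a name that was
/// public on the first lookup is still checked on the second.
pub fn resolve_filtered(
    resolver: &mut ScriptedResolver,
    _host: &str,
) -> Result<Vec<IpAddr>, ResolveError> {
    let all = resolver.lookup();
    if all.is_empty() {
        return Err(ResolveError::NoAddresses);
    }
    let allowed: Vec<IpAddr> = all.into_iter().filter(|ip| !denied(ip)).collect();
    if allowed.is_empty() {
        Err(ResolveError::Ssrf)
    } else {
        Ok(allowed)
    }
}
```

Keeping `Ssrf` and `NoAddresses` as distinct variants is what lets the caller report a policy block separately from a plain DNS failure.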
Decisions beyond the brief (every prompt-default deviation, flagged)
- Three-arg split `verb(url, body, opts)` (user-approved during planning). Diverges from the brief's documented two-arg `(url, opts)` shape and generalizes the escape hatch to `request(method, url, body, opts)`. Resolves the brief's internal contradiction (its Slack example `http::post(url, #{text:...})` passed a bare body map, which would be an "unknown opt key" under the two-arg rule). The `opts` vocabulary is now exactly `{headers, timeout_ms, follow_redirects, max_redirects}`; `body_raw` was dropped (raw strings go through the positional body as a String). The Slack example works unchanged (`#{text:...}` is the body).
- Cron crate = `cron` (0.12), not `croner`. The brief allowed either; `cron` handles the 6-field-with-seconds format and named weekdays (`MON-FRI`) used in the brief's example, and integrates with chrono via `Schedule::after`.
- Catch-up = fire-once: matches the brief; called out explicitly as requested. No deviation.
- `SdkCallCx` gained a `script_id` field. The brief's default User-Agent is `picloud/<v> (script:<script_id>)`, but `SdkCallCx` didn't carry the script id. Adding it (sourced from `ExecRequest.script_id` in the engine) is the clean home and doubles as the audit-attribution key the brief emphasizes. All 19 construction sites updated. The dead-letters admin cx uses a fresh sentinel id (no script executes there).
- SSRF also blocks IPv6 unspecified `::` and IPv4 `0.0.0.0` with reason "unspecified". `0.0.0.0/8` is in the brief's list; `::` is not listed explicitly but is an obvious sibling hole, so I blocked it too (defensible superset).
- No reqwest feature additions needed: `dns_resolver` and `Response::chunk()` compile under the existing `default-features = false, features = ["json","rustls-tls"]`. No cookie jar (cookies feature is off, so there's no jar to disable). Added `url` as an executor-core dep (for `form_urlencoded` in `post_form`).
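
To make the closed `opts` vocabulary concrete, here is a minimal sketch of strict option parsing. It assumes a simplified `OptValue` enum in place of Rhai's dynamic values, and `RequestOpts`, `parse_opts`, and the error strings are all hypothetical. Only the four key names and the "unknown key throws" / "timeout > max throws" behaviors come from this handback.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified option values; the real bridge receives Rhai values.
#[derive(Debug, Clone, PartialEq)]
pub enum OptValue {
    Int(i64),
    Bool(bool),
    Headers(BTreeMap<String, String>),
}

#[derive(Debug, Default, PartialEq)]
pub struct RequestOpts {
    pub headers: BTreeMap<String, String>,
    pub timeout_ms: Option<u64>,
    pub follow_redirects: Option<bool>,
    pub max_redirects: Option<u32>,
}

/// Accepts exactly {headers, timeout_ms, follow_redirects, max_redirects}.
/// Any other key (including the dropped `body_raw`) is an error, as is a
/// timeout above `max_timeout_ms`. Error wording is illustrative only.
pub fn parse_opts(
    raw: BTreeMap<String, OptValue>,
    max_timeout_ms: u64,
) -> Result<RequestOpts, String> {
    let mut opts = RequestOpts::default();
    for (key, value) in raw {
        match (key.as_str(), value) {
            ("headers", OptValue::Headers(h)) => opts.headers = h,
            ("timeout_ms", OptValue::Int(ms)) if ms >= 0 => {
                let ms = ms as u64;
                if ms > max_timeout_ms {
                    return Err(format!(
                        "http: timeout_ms {ms} exceeds maximum {max_timeout_ms}"
                    ));
                }
                opts.timeout_ms = Some(ms);
            }
            ("follow_redirects", OptValue::Bool(b)) => opts.follow_redirects = Some(b),
            ("max_redirects", OptValue::Int(n))
                if (0..=i64::from(u32::MAX)).contains(&n) =>
            {
                opts.max_redirects = Some(n as u32);
            }
            // Known key, wrong type or out-of-range value.
            ("headers" | "timeout_ms" | "follow_redirects" | "max_redirects", _) => {
                return Err(format!("http: invalid value for option '{key}'"));
            }
            _ => return Err(format!("http: unknown option '{key}'")),
        }
    }
    Ok(opts)
}
```

With a strict parser like this, a body map such as `#{text: ...}` passed in the options position fails loudly on the unknown `text` key, which is the ambiguity the three-arg shape removes.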
How to verify locally (§8 attestation — run on this exact HEAD)
All four gates were run on the handback HEAD (the feat(v1.1.4) commit, before
this markdown commit):

    cargo fmt --all -- --check                                  → exit 0
    cargo clippy --all-targets --all-features -- -D warnings    → exit 0
    cargo test --workspace                                      → 427 passed, 0 failed
    (cd dashboard && npm run check)                             → 0 errors, 0 warnings (369 files)

This HANDBACK commit is pure markdown (no gate-relevant files), so the numbers above hold for the final HEAD.
Migrations — verified against a real Postgres (dev stack, port 15432):
- Fresh-DB replay: the `#[sqlx::test]` schema-snapshot test applies all migrations on a fresh ephemeral DB and matches the (re-blessed) golden; passes.
- On-top-of-prior-state: booting `picloud` against a dev DB pinned at migration `0006` applied `0007`…`0017` cleanly ("migrations applied"); `_sqlx_migrations` max is now `17`; `cron_trigger_details` + widened CHECKs present.
Live smoke performed:
- Boot logged the `PICLOUD_HTTP_ALLOW_PRIVATE` warning and started the cron scheduler + HTTP service without panic.
- Seeded an every-second cron trigger → scheduler set `last_fired_at`, dispatcher consumed the outbox row and ran the script (row deleted on success), and `last_fired_at` advanced at the tick cadence (fire-once confirmed). Smoke data cleaned up afterward.
- HTTP GET / SSRF-block / body-dispatch behaviors are covered by the automated integration tests (real `TcpListener` round-trips + loopback/hostname SSRF blocks) rather than a manual curl flow, since a live SSRF-block smoke conflicts with the `PICLOUD_HTTP_ALLOW_PRIVATE` setting that a local-server smoke requires.
To re-run the schema snapshot:

    docker compose up -d postgres
    DATABASE_URL=postgres://picloud:picloud@127.0.0.1:15432/picloud \
      cargo test -p picloud-manager-core --test schema_snapshot -- --include-ignored

⚠️ Latent issue found: stale schema-snapshot golden
`crates/manager-core/tests/expected_schema.txt` was significantly stale: the
committed golden was missing many tables from prior releases
(`abandoned_executions`, `dead_letters`, `dead_letter_trigger_details`,
`docs_*`, etc.). The `schema_snapshot` test is `#[ignore]` (needs a DB), so it
was apparently never re-blessed across v1.1.1–v1.1.3 and silently drifted.
I re-blessed it, so the diff is large (+217 lines), but **only `cron_trigger_details`
and the two widened CHECK constraints are v1.1.4-new**; the rest is pre-existing
drift correction. The blessed golden now matches a clean replay (verified).
Recommend the reviewer skim the diff to confirm, and consider whether the
`#[ignore]` should be lifted in CI (with a DB service) so the golden can't drift again.
Latent security findings
None new beyond the (already-known, already-closed-in-v1.1.3) cross-app trigger gap, which §10c now documents in the CHANGELOG. The SSRF surface is the main security mechanism in this release; see the SSRF notes above for the defense-in-depth layering (resolver hook + literal-IP check + per-hop re-validation + IP-never-leaked errors).
One thing for the reviewer to weigh: the SSRF policy is a hardcoded deny-list
with no per-app allow-list (deferred to v1.2 per the brief). An operator who
needs a script to reach a private service has only the all-or-nothing
PICLOUD_HTTP_ALLOW_PRIVATE global escape hatch today.
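
For reviewers weighing that trade-off, the shape of a hardcoded deny-list with a single global bypass can be sketched as follows. This is an illustration under assumptions, not the code in `ssrf.rs`: the function names are hypothetical, and the CIDR-to-category mapping is inferred from the category names and CIDR list in this handback (only `::ffff:127.0.0.1` → "loopback" and `0.0.0.0` / `::` → "unspecified" are stated explicitly).

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns the deny category for an address, or `None` if it is allowed.
/// `allow_private` models the all-or-nothing `PICLOUD_HTTP_ALLOW_PRIVATE` switch.
pub fn deny_reason(ip: IpAddr, allow_private: bool) -> Option<&'static str> {
    if allow_private {
        return None;
    }
    match ip {
        IpAddr::V4(v4) => deny_reason_v4(v4),
        IpAddr::V6(v6) => deny_reason_v6(v6),
    }
}

/// Assumed mapping of each IPv4 CIDR to a category name.
fn deny_reason_v4(ip: Ipv4Addr) -> Option<&'static str> {
    let [a, b, _, _] = ip.octets();
    match (a, b) {
        (0, _) => Some("unspecified"),                // 0.0.0.0/8
        (127, _) => Some("loopback"),                 // 127.0.0.0/8
        (10, _) => Some("private"),                   // 10.0.0.0/8
        (172, 16..=31) => Some("private"),            // 172.16.0.0/12
        (192, 168) => Some("private"),                // 192.168.0.0/16
        (169, 254) => Some("link-local"),             // 169.254.0.0/16
        (100, 64..=127) => Some("carrier-grade-nat"), // 100.64.0.0/10
        (224..=239, _) => Some("multicast"),          // 224.0.0.0/4
        (240..=255, _) => Some("reserved"),           // 240.0.0.0/4
        _ => None,
    }
}

fn deny_reason_v6(ip: Ipv6Addr) -> Option<&'static str> {
    // IPv4-mapped (::ffff:a.b.c.d): re-run the v4 list on the embedded address.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return deny_reason_v4(v4);
    }
    let first = ip.segments()[0];
    if ip.is_unspecified() {
        Some("unspecified") // ::
    } else if ip.is_loopback() {
        Some("loopback") // ::1
    } else if first & 0xffc0 == 0xfe80 {
        Some("link-local") // fe80::/10
    } else if first & 0xfe00 == 0xfc00 {
        Some("unique-local") // fc00::/7
    } else if first & 0xff00 == 0xff00 {
        Some("multicast") // ff00::/8
    } else {
        None
    }
}

/// Script-visible message: carries the category, never the resolved IP.
pub fn script_error(reason: &str) -> String {
    format!("http: blocked by SSRF policy: {reason}")
}
```

Because the bypass is a single boolean checked before any range test, there is no way in this shape to admit one private destination without admitting all of them, which is the gap a per-app allow-list would close.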
Open questions for the reviewer
- Three-arg HTTP shape (decision #1): confirm you're happy with `verb(url, body, opts)` + dropping `body_raw`, vs the brief's documented two-arg form. This is the one user-facing API-shape divergence.
- Stale schema golden: OK to land the full re-bless in this PR, or would you prefer the drift correction split out?
Deferred items (per the brief's OUT list — not built)
WebSocket/SSE, streaming responses, HTTP/3, per-app outbound allow/deny lists, per-app rate limits, mTLS, request signing, cookie jar, cron backfill replay, cron next-fire preview, cron schedule history, drift compensation, module-import-over-HTTP, files/pubsub/secrets/email/users/queue.
Known limitations / rough edges
- The dashboard Triggers tab lists all trigger kinds but only creates cron triggers (kv/docs creation remains API-only, unchanged from before). No next-fire-at preview (deferred to v1.2).
- `post_form` body field order follows Rhai map iteration order (BTreeMap-backed, so sorted/deterministic; not insertion order).
- The cron scheduler tick is floored at 1s; sub-second schedules effectively fire at the tick cadence (by design; see the fire-once policy).
- The stale REVIEW.md at repo root is the v1.1.3 reviewer's artifact; the v1.1.4 reviewer should overwrite it.
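
The `post_form` ordering caveat above is easy to see in a small sketch. It hand-rolls `application/x-www-form-urlencoded` encoding purely for illustration (the handback says the implementation uses `form_urlencoded` from the `url` crate); `encode_component` and `encode_form` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Encodes one component with the application/x-www-form-urlencoded rules:
/// ASCII alphanumerics and `*-._` pass through, space becomes `+`,
/// every other byte becomes `%XX`.
fn encode_component(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for b in s.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'*' | b'-' | b'.' | b'_' => {
                out.push(b as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{b:02X}")),
        }
    }
    out
}

/// Serializes fields in BTreeMap iteration order, i.e. sorted by key,
/// regardless of the order in which they were inserted.
pub fn encode_form(fields: &BTreeMap<String, String>) -> String {
    fields
        .iter()
        .map(|(k, v)| format!("{}={}", encode_component(k), encode_component(v)))
        .collect::<Vec<_>>()
        .join("&")
}
```

Inserting `zeta` before `alpha` still serializes `alpha` first, so a receiving endpoint that depends on field order would need the caller to build the body string by hand.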