Commit Graph

5 Commits

Author SHA1 Message Date
MechaCat02
6891496589 feat(manager-core): admin auth gate (Phase 3a)
Closes the regression risk of the admin API and dashboard being open
to anyone reaching the bound port. Required foundation before v1.1
data-plane services land.

Per-user accounts (admin_users), Argon2id passwords, env-var bootstrap
of the first admin that becomes inert once any admin exists, opaque
32-byte session token doubling as bearer credential, 24h sliding TTL
configurable via PICLOUD_SESSION_TTL_HOURS. is_active column lets
admins be deactivated without losing audit history; last-active-admin
guard on DELETE and on PATCH that flips is_active to false (sessions
also wiped on deactivation).

require_admin middleware fronts every /api/v1/admin/* route. The data
plane (/api/v1/execute/{id}), /healthz, /version, and user routes
stay open. picloud admin reset-password <username> subcommand handles
recovery without going through HTTP.

Dashboard gains /admin/login and /admin/admins surfaces, a top-bar
user menu, and a token store with a localStorage echo so refreshes
don't sign you out. Cookie-based auth works in parallel for non-SPA
clients.

Forward compatibility: future RBAC tables (admin_roles,
admin_user_roles) join on admin_users.id; the auth middleware is the
seam where role checks slot in. Email, 2FA, passkeys, and personal
API tokens are all additive without touching admin_users.

Blueprint §11.4 updated to reflect what actually shipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 19:30:25 +02:00
MechaCat02
07e2a62d98 feat: custom routing — bind scripts to your own URLs
Scripts can now answer at user-chosen paths (e.g. /greet, /greet/:name,
/webhooks/*), on user-chosen hosts (strict or *.example.com wildcards),
on user-chosen methods. The internal /api/v1/execute/{id} endpoint
stays as the always-available ID-based bypass.

Routing rules (decided in design with the user; see chat history):

  Path kinds:
    exact   /greet              literal
    prefix  /greet/*            strict-subtree; stored as "/greet/";
                                does NOT match bare /greet (add an
                                exact route for that case)
    param   /users/:id          :name captures one whole segment;
                                mid-segment colons are rejected;
                                {name} is reserved for a future SDK

  Host kinds:
    any                         no Host header constraint
    strict  sub.example.com     literal match (case-insensitive)
    wildcard *.example.com      suffix match; multi-level subdomains OK

  Within-kind uniqueness:
    two routes of the same kind that could match the same request
    conflict at config time. Algorithm (orchestrator_core::routing::
    conflict):
      exact:  literal equality
      prefix: literal equality (longer-prefix coexists; longer wins
              at request time)
      param:  same segment count + same literals at every
              literal-vs-literal position (the user's example:
              :id vs :userId at same shape is a conflict)

  Request-time precedence:
    exact > param > prefix
    among non-exact: more leading-literal segments wins
    tie: param > prefix (more constrained)
    within prefix: longest matching prefix wins
    host bucket: strict > wildcard (longer suffix) > any; fall through
    to less specific buckets when path doesn't match

  Reserved path prefixes: /api/, /admin/, /healthz, /version

  Routes that look invalid at config time return 422 with the precise
  parse error; conflicting routes return 409 with the conflicting route
  in the body (so the dashboard can render the conflict inline).

What landed:

  * 0003_routes.sql — routes table (host_kind, host, host_param_name,
    path_kind, path, method, script_id) with UNIQUE index on the
    literal binding tuple. Schema 2 → 3.

  * shared::Route / HostKind / PathKind — flat storage shape that
    crosses wire boundaries cleanly.

  * orchestrator_core::routing — four sub-modules, all unit-tested:
      pattern.rs (16 tests)  parse + validate + display
      conflict.rs (12 tests) within-kind overlap predicate
      matcher.rs (12 tests)  runtime dispatch (specificity-aware)
      table.rs               Arc<RwLock<Vec<CompiledRoute>>>
                             shared by manager (writes) and
                             orchestrator (reads); atomic replace
                             after each admin write

  * manager-core::route_admin — five new admin endpoints under
    /api/v1/admin:
      POST   /scripts/{id}/routes      create
      GET    /scripts/{id}/routes      list per script
      DELETE /routes/{route_id}        delete (refreshes table)
      POST   /routes:check             pre-flight conflict check
                                       (powers the dashboard's
                                       live conflict warning)
      POST   /routes:match             synthetic URL → matched
                                       route + extracted params
                                       (powers the dashboard's
                                       match-preview tool)
    Stored path strings stay raw (user-typed); normalization
    happens only in the in-memory CompiledRoute so re-parses are
    idempotent.

  * orchestrator_core::api::user_routes_router — fallback handler
    mounted in picloud after the system routes. Reads Host /
    method / path / query from the request, dispatches via the
    table, builds an ExecRequest with params/query/rest filled,
    calls the executor, writes to the log sink. 10 MiB body cap.

  * executor-core::ctx (SDK 1.0 → 1.1) — adds
      ctx.request.params  (map of named-param captures)
      ctx.request.query   (parsed query string)
      ctx.request.rest    (suffix for prefix routes; "" otherwise)
    All three are always present (empty when not applicable) so
    scripts can read them unconditionally.

  * picloud::build_app — now async; loads routes at startup,
    populates the shared table, mounts route_admin_router under
    /api/v1/admin alongside the script CRUD, and the user-routes
    fallback at the app root.

  * caddy/Caddyfile + Caddyfile.prod widened: anything not
    /healthz, /version, /api/v1/admin/*, /api/v1/execute/*,
    /api/* (404 sunset), or /admin/* (dashboard) → picloud.

  * Dashboard moves to /admin/* via SvelteKit paths.base. Its
    internal Caddy strips the prefix and serves with SPA fallback.
    All in-app links use $app/paths. The dashboard URL is now
    http://localhost:8000/admin/ — one-time break for the new
    URL freedom users gained.

  * PICLOUD_PUBLIC_BASE_URL env var, exposed via /version so the
    dashboard renders full URLs for routes regardless of the
    operator's external port / TLS setup.

  * memory_limit_mb stays in the schema, still v1.3+ advisory.

Verified live through Caddy:
  /version              → schema 3, sdk 1.1, public_base_url
  GET /admin/           → 200, dashboard HTML containing "PiCloud"
  POST /api/v1/admin/scripts → 201
  POST .../scripts/{id}/routes (path=/greet/:name) → 201
  GET /greet/alice?lang=en → 200 {"name":"alice","q":"en"}
  POST conflicting route → 409 with conflicting_route body
  POST /admin/foo route → 422 "reserved"
  POST /api/v1/admin/routes:match → matched + params extracted
  GET /unbound-path → 404 JSON

Tests:
  * 40 routing unit tests (pattern + conflict + matcher tables)
  * 14 executor-core unit tests (one new for ctx.request.params/
    query/rest exposure)
  * 32 integration tests (10 new for routing CRUD + dispatch +
    conflict + reserved + specificity tie-break + match preview +
    delete invalidation + /version returns public_base_url)
  * default cargo test --workspace stays green; opt-in via
    DATABASE_URL + --include-ignored for the integration suite

Bumps: schema 2 → 3; SDK 1.0 → 1.1; product 0.3.0 → 0.4.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:18:16 +02:00
MechaCat02
f51924fdbc feat: per-script Rhai sandbox overrides with admin ceiling
Adds optional per-script overrides for the six Rhai sandbox knobs
(max_operations, max_string_size, max_array_size, max_map_size,
max_call_levels, max_expr_depth). The executor merges its defaults
with each script's overrides on every call; the manager validates
overrides against an admin-set ceiling at write time, so the
executor trusts whatever is stored.

Storage chose JSONB on the existing scripts table over six new
columns: lets future knobs land as code-only changes, keeps the
sparse common case (most scripts override nothing) cheap to store
and serialize, and matches how the manager + executor pass the
config across the wire.

  * 0002_sandbox.sql — ALTER TABLE scripts ADD COLUMN sandbox
    JSONB NOT NULL DEFAULT '{}'
  * shared::ScriptSandbox — six Option<u64> fields with
    deny_unknown_fields so typos surface as 422
  * Script.sandbox + ExecRequest.sandbox_overrides — typed end
    to end; cluster mode just serializes the same struct
  * executor-core::Limits::with_overrides — field-by-field
    replacement; tests cover the override actually tightening
    the live engine
  * manager-core::SandboxCeiling — built-in conservative
    defaults (10M ops, 1 MiB strings, 100k array/map, 128
    call/expr depth); env vars override per knob, invalid
    values warn-and-skip rather than blocking boot
  * manager-core admin API — POST/PUT accept `sandbox`; values
    above the ceiling return 422 with the specific field +
    requested + ceiling; absent or `{}` keeps platform defaults
  * picloud all-in-one — wires SandboxCeiling::from_env() into
    AdminState
  * memory_limit_mb stays in the schema, marked v1.3+ advisory
    (no enforcement until OS-level isolation lands with
    cluster-mode executors)

Verified live through Caddy:
  * /version reports schema 2, product 0.3.0
  * Script with max_operations: 500 → 507 on a 10k-iteration loop
  * Same script after PUT raising to 1M → succeeds, returns 10000
  * POST with max_operations: 1_000_000_000 → 422 (exceeds ceiling)

Tests:
  * 13 executor-core unit tests (added 2 for override semantics)
  * 20 integration tests (added 6 for sandbox CRUD + ceiling +
    unknown-field rejection + executor honoring overrides)
  * default cargo test --workspace stays green (integration tests
    remain #[ignore]'d until DATABASE_URL is set)

Bumps:
  * schema 1 → 2
  * product 0.2.0 → 0.3.0
  * SDK unchanged (scripts see nothing new)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:26:12 +02:00
MechaCat02
0473d295af feat: versioning scheme — lockstep crates + four independent surfaces
Establish how versions are assigned, bumped, and checked across the
five things that actually change for users: the product itself, the
Rhai SDK, the HTTP API, the database schema, and the inter-service
wire (reserved for cluster mode). Crates ship in lockstep — drift
between picloud-shared and picloud-manager-core is fiction since
they always release together — but surfaces are versioned and
checked at their natural boundaries.

  * docs/versioning.md is the authoritative reference: what gets a
    version, the per-surface compatibility rules, how each surface
    bump cascades to the product version (loose pre-1.0, strict
    post-1.0), and the five enforcement mechanisms (lockstep at
    compile time, /version at runtime, golden SDK contract tests,
    migration replay, CI guardrail).

  * shared::version exposes four constants — PRODUCT_VERSION (from
    CARGO_PKG_VERSION), SDK_VERSION ("1.0"), API_VERSION (1),
    WIRE_VERSION (1). Scripts read SDK_VERSION as ctx.sdk_version
    and can feature-detect against it.

  * Workspace inheritance: `[workspace.package] version = "0.2.0"`
    is the single point of truth; every crate uses
    `version.workspace = true`. dashboard/package.json mirrors.

  * Routes move to /api/v1/* — both control plane
    (/api/v1/admin/*) and data plane (/api/v1/execute/{id}).
    Picloud composes them via a single `/api/v{API_VERSION}` nest,
    so the next major is a copy-paste-and-bump. Caddyfile (dev and
    prod) routes /api/v1/* to picloud and 404s any other /api/*
    so old clients fail loudly instead of getting the SPA shell.
    Dashboard client + integration tests updated.

  * /healthz remains a plain "ok" string (k8s probes); /version is
    the new JSON endpoint returning every surface version in one
    place — product, sdk, api, schema (from
    manager-core::migrations::latest_version), wire.

  * Reasonable bump rationale: API path changes are breaking by
    definition, so 0.1.0 → 0.2.0 (pre-1.0 license to bump minor on
    any breaking change). SDK starts at 1.0 because scripts depend
    on it more strictly than the product depends on its internals;
    we'd rather promise SDK stability early than pull the rug.

Verified live:
  * /healthz → "ok" (plain text)
  * /version → {product:"0.2.0",sdk:"1.0",api:1,schema:1,wire:1}
  * /api/v1/admin/scripts → 200
  * /api/admin/scripts → 404 with error JSON (sunset major)
  * Script can read ctx.sdk_version → "1.0"
  * All 14 integration tests pass against new paths
  * 11 executor-core unit tests pass (added one for sdk_version
    exposure with the major.minor format invariant)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 00:31:08 +02:00
MechaCat02
777f4af628 feat: persist execution logs + dashboard detail view + integration tests
Three threads landing together because they share a public surface
(the new execution_log shape) and verifying any one in isolation
would mean re-doing the work later.

== (A) execution log persistence ==

  * shared::ExecutionLog + ExecutionStatus carry the audit-trail
    shape that flows from the orchestrator through the sink and
    back out via the manager's logs endpoint.

  * shared::ExecutionLogSink trait — abstraction the orchestrator
    writes through. In single-process MVP mode the manager's
    Postgres-backed impl is plugged in directly; in cluster mode
    (v1.3+) the orchestrator's impl will post over HTTP to the
    manager. Trait lives in `shared` so neither *-core crate has
    to know about the other.

  * manager-core::PostgresExecutionLogSink writes to the
    execution_logs table (already in the initial migration);
    PostgresExecutionLogRepository reads them back, paginated.
    AdminState now carries both a script repo and a log repo, so
    `admin_router` exposes `GET /scripts/{id}/logs?limit=&offset=`
    capped at 200 rows per page to keep the dashboard responsive.

  * orchestrator-core::DataPlaneState gains `log_sink`. The
    execute handler builds an ExecutionLog on every outcome —
    success, error, timeout, budget-exceeded — and awaits the
    sink. Sink failures are logged at warn and DO NOT mask the
    user-facing result, since "we couldn't write the audit row"
    is a separate concern from "the script ran".

  * picloud binary refactored into a lib (`build_app(pool)` is
    the seam) + thin bin shell. Same Postgres pool backs the
    script repo, the log repo, and the sink — no double pool.

== (B) dashboard ==

  * Typed API client extended with `scripts.logs(id, opts)`,
    `scripts.update/remove`, and `execute(id, body, headers)`.
    Plain `fetch` wrapper now surfaces server-side error
    messages via a typed ApiError so the UI can render them.

  * `/` — create-script form now actually creates; on success
    the list reloads. List entries link to detail.

  * `/scripts/[id]` — new detail route: source editor with save
    (calls update, version bumps); Test invoke panel that sends
    arbitrary JSON body + headers to /api/execute and shows the
    response; Recent executions panel reading from /logs with
    expandable per-row request/response/script-log views.
    Delete button with confirm. SPA-routed; Caddy serves
    `build/` with the same index.html fallback.

== (C) integration tests ==

  * crates/picloud/tests/api.rs — 14 sqlx::test cases driving
    `build_app` through an axum_test::TestServer against a fresh
    Postgres DB per test. Covers: health, full script CRUD,
    duplicate-name conflict, invalid-source rejection on both
    create and update, execute echoing the body, status+header
    passthrough, 404 on missing scripts, error-path executions
    landing in the audit log with the right status.

  * Tests are `#[ignore]` by default so plain `cargo test
    --workspace` stays green without infrastructure. Opt-in via:
    `docker compose up -d postgres && \
       DATABASE_URL=postgres://picloud:picloud@127.0.0.1:15432/picloud \
       cargo test -p picloud --test api -- --include-ignored`

Verified live through Caddy on :8000: three logged invocations
land in the logs endpoint with the right structured `data` on
each `log::info`/`log::warn`, error-path executions are still
captured with status=error, dashboard list + SPA detail route
both reachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 00:16:32 +02:00