From 5546323cdc636e4710c1c3ba8246730bf0d491a8 Mon Sep 17 00:00:00 2001
From: MechaCat02
Date: Tue, 26 May 2026 21:31:25 +0200
Subject: [PATCH] =?UTF-8?q?docs(blueprint):=20add=20=C2=A711.6=20Phase=203?=
 =?UTF-8?q?.5=20users,=20roles,=20and=20bearer-token=20auth?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Specifies the unified can(principal, capability) gate, instance roles (owner/admin/member), per-app memberships (app_admin/editor/viewer), pic_-prefixed API keys, and the schema rooms for invites / MFA / service accounts. Updates §12 Phase 3 to add 3c as a third foundation piece alongside 3a (admin auth) and 3b (multi-app scoping).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 serverless_cloud_blueprint.md | 152 +++++++++++++++++++++++++++++++++-
 1 file changed, 150 insertions(+), 2 deletions(-)

diff --git a/serverless_cloud_blueprint.md b/serverless_cloud_blueprint.md
index 1aa8c5d..e795ff5 100644
--- a/serverless_cloud_blueprint.md
+++ b/serverless_cloud_blueprint.md
@@ -1022,6 +1022,152 @@ The scripts and routes endpoints keep their existing shape — this avoids forci
 
 ---
 
+## 11.6 Users, roles, and bearer-token auth (Phase 3.5) — Pending
+
+**Status**: pending. Targets `crates/manager-core/src/{authz,api_keys_api,api_key_repo}.rs`, an extended `auth_middleware.rs`, new shared types under `crates/shared/src/auth.rs`, migration `0006_users_authz.sql`.
+
+**Purpose**: bridge Phase 3b → Phase 4. Phase 4's v1.1 SDKs (KV, docs, HTTP, cron) each gate access on the calling principal. Without a real authorization model in place, every SDK addition has to either invent its own gate or stay open. Phase 3.5 lands `can(principal, capability)` as the single check every future SDK + admin endpoint goes through, so v1.1 work focuses on data plane shape, not on re-litigating auth.
+
+**Why this slot**: same logic as Phase 3b.
Adding a `Principal` parameter and a capability check to surfaces that don't exist yet is free; retrofitting them onto live SDK services after v1.1 ships is a refactor of every gate.
+
+### Principal Model
+
+One `Principal` value represents a human admin user. Service accounts (CI bots, Rhai scripts calling out) get **schema room** in this phase but no runtime support — `users.kind` style differentiation lands when Phase 4's `users.*` SDK arrives. Until then, every authenticated request resolves to exactly one admin row, whether the credential is a session cookie or a bearer API key.
+
+```rust
+pub struct Principal {
+    pub user_id: UserId,             // alias of AdminUserId for the transition
+    pub instance_role: InstanceRole,
+    pub scopes: Option>,             // None = cookie session (full role authority)
+                                     // Some = API key (intersect with role)
+    pub app_binding: Option,         // API key bound to one app; denies other apps
+}
+```
+
+### Instance Roles (one per user)
+
+| Role | Powers |
+|---|---|
+| `owner` | full instance control, manage other owners, implicit `app_admin` on every app. Multiple owners allowed. |
+| `admin` | create apps, invite users, implicit `editor` on every app. Cannot manage instance-wide settings or other owners. |
+| `member` | invited into specific apps only. Cannot create apps, cannot invite. **Strict isolation enforced at SQL** — list endpoints `WHERE app_id IN (SELECT app_id FROM app_members WHERE user_id = $1)`; the API never returns apps a member isn't part of. |
+
+The current Phase 3a `admin_users` rows all become `owner` via `DEFAULT 'owner'` on the new column. Multi-owner installs get a startup `tracing::warn!` listing the active owner usernames so the operator can demote extras via `PATCH /api/v1/admin/admins/{id}`.
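As a rough illustration of the implicit grants in the role table above, the following Rust sketch combines an instance role with an optional explicit `app_members` row into one effective app role. The `effective_app_role` helper, the enum names, and the rule that the stronger of the implicit and explicit role wins are assumptions made for this sketch; the patch itself only states the implicit grants (`owner` to `app_admin`, `admin` to `editor`) and that `member` users rely on explicit rows.

```rust
/// Instance-wide role, one per user (mirrors the `instance_role` column).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstanceRole {
    Owner,
    Admin,
    Member,
}

/// App-scoped role. Variants are declared weakest-first so the derived
/// `Ord` ranks Viewer < Editor < AppAdmin (an assumed hierarchy).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AppRole {
    Viewer,
    Editor,
    AppAdmin,
}

/// Hypothetical helper: fold the implicit grant from the instance role and
/// an optional explicit membership row into a single effective app role.
/// `None` means the user has no access to the app at all.
pub fn effective_app_role(
    instance_role: InstanceRole,
    explicit: Option<AppRole>,
) -> Option<AppRole> {
    let implicit = match instance_role {
        InstanceRole::Owner => Some(AppRole::AppAdmin),
        InstanceRole::Admin => Some(AppRole::Editor),
        InstanceRole::Member => None,
    };
    // `Option<T>` orders `None` below every `Some(_)`, so `max` picks the
    // stronger of the two grants, or `None` when neither exists.
    std::cmp::max(implicit, explicit)
}
```

Under these assumptions a `member` with no membership row resolves to `None`, which matches the strict isolation described for list endpoints.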
+
+### App-Scoped Roles (zero-to-many per user × app)
+
+| Role | Grants |
+|---|---|
+| `app_admin` | settings, domain claims, delete app |
+| `editor` | CRUD on scripts, routes, sandbox config |
+| `viewer` | read scripts + execution logs |
+
+Implicit grants from instance role: every `owner` is `app_admin` on every app; every `admin` is `editor` on every app. Explicit `app_members` rows are the only path for `member` users.
+
+### Auth Methods — Same Principal, Different Extractor
+
+Two credential types feed the same middleware:
+
+1. **Session cookie** (Phase 3a, unchanged) — `picloud_session=`. Extracted by header or cookie. SHA-256 lookup against `admin_sessions.token_hash`. Sliding 24h TTL. Produces `Principal { scopes: None, app_binding: None }`.
+
+2. **Bearer API key** (new) — `Authorization: Bearer pic_`. The `pic_` prefix is the discriminator: present → API key path; absent → session path. The 8 chars immediately after `pic_` are indexed (`api_keys.prefix`); the full body after `pic_` is Argon2id-verified against each candidate's `hash`. Last-used timestamp updates inline.
+
+Both paths converge on the same `Principal` extension; handlers cannot tell which credential was presented unless they introspect `principal.scopes`.
+
+### API Key Format & Storage
+
+- Raw form: `pic_` — ~56 chars total.
+- Stored: 8-char prefix + Argon2id PHC hash of the body. Raw value returned **exactly once** in the `POST /api/v1/admin/api-keys` response; never logged, never readable again.
+- Optional `expires_at`. Lookup queries always filter `expires_at IS NULL OR expires_at > NOW()`.
+- Optional `app_id` ("bound key") — every `App*(other_app)` capability is denied for this key, regardless of the user's role.
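The `pic_` discriminator and the 8-character lookup prefix described above could be sketched roughly as follows. This is a minimal illustration using only the standard library: the `Credential` enum, `classify`, `from_authorization_header`, and `is_live` are hypothetical names, timestamps are modeled as plain seconds, and the Argon2id verification and database lookup are deliberately left out.

```rust
/// Which lookup path a presented token takes (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Credential<'a> {
    /// No `pic_` prefix: handled by the existing session path.
    Session(&'a str),
    /// `pic_` prefix: `lookup_prefix` is the indexed first 8 chars of the
    /// body; `body` is what would be verified against each candidate hash.
    ApiKey { lookup_prefix: &'a str, body: &'a str },
}

/// Classify a raw token by the `pic_` discriminator.
/// Returns `None` for empty tokens and for API keys too short to carry
/// the 8-char lookup prefix.
pub fn classify(token: &str) -> Option<Credential<'_>> {
    match token.strip_prefix("pic_") {
        None if token.is_empty() => None,
        None => Some(Credential::Session(token)),
        Some(body) => {
            // `get(..8)` yields `None` when the body is shorter than
            // 8 bytes or byte 8 is not a character boundary.
            let lookup_prefix = body.get(..8)?;
            Some(Credential::ApiKey { lookup_prefix, body })
        }
    }
}

/// Pull the token out of an `Authorization: Bearer ...` header value.
pub fn from_authorization_header(value: &str) -> Option<Credential<'_>> {
    classify(value.strip_prefix("Bearer ")?)
}

/// In-memory analogue of `expires_at IS NULL OR expires_at > NOW()`.
pub fn is_live(expires_at: Option<u64>, now: u64) -> bool {
    expires_at.map_or(true, |t| t > now)
}
```

Keeping classification separate from verification means the slow hash check only runs for rows that already match the indexed prefix.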
+
+### Scope Set (intentionally narrow)
+
+Exactly seven scopes; no further subdivision until a real use case appears:
+
+`script:read`, `script:write`, `route:write`, `domain:manage`, `log:read`, `app:admin`, `instance:admin`
+
+Mint-time validation rejects unknown values. Bound keys (`app_id` set) cannot carry `instance:*` scopes — the combination is irreconcilable (a bound credential cannot claim instance-wide authority) and is rejected with 422.
+
+### Effective Capability — `can(principal, capability)`
+
+```
+allow = role_grants(principal.instance_role, capability)
+      ∧ (principal.scopes.is_none() ∨ required_scope(capability) ∈ principal.scopes)
+      ∧ (principal.app_binding.is_none() ∨ capability.app_id() == principal.app_binding)
+```
+
+`role_grants` collapses the three tables (instance role + implicit app grants + explicit `app_members`) into a single yes/no. Each handler calls `state.authz.require(&principal, Capability::AppWrite(script.app_id))` after loading the resource (so the capability binds to the resource's actual `app_id`, not a path param the caller controls).
+
+### Deactivation Symmetry
+
+Phase 3a's `set_active(false)` wipes that user's `admin_sessions`. Phase 3.5 extends it to also set `expires_at = NOW()` on every row in `api_keys WHERE user_id = $1` — both credential surfaces become inert at the same moment, no enumeration window.
+
+### CLI Auth Posture (forward note)
+
+The eventual `picloud` CLI authenticates by **paste-the-token**, not OAuth: the user runs `picloud login`, the dashboard mints a fresh key (or the user mints one via `POST /api/v1/admin/api-keys`), and the CLI prompts for the raw token. The CLI binary itself is deferred; the dashboard surface and the bearer credential type land here so the CLI is a thin wrapper when it arrives.
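The mint-time rules from the Scope Set section and the three-way conjunction from the Effective Capability section above might look roughly like this in Rust. Several pieces are assumptions made for the sketch and are not taken from the patch: `AppId` is a plain integer stand-in, the `Capability` variants and the `required_scope` mapping are invented, `scopes` is modeled as a `HashSet`, and `role_grants` is passed in as a precomputed flag because the real value needs the `app_members` lookup.

```rust
use std::collections::HashSet;

pub type AppId = u32; // stand-in for the real app identifier type

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Scope {
    ScriptRead, ScriptWrite, RouteWrite, DomainManage, LogRead, AppAdmin, InstanceAdmin,
}

impl Scope {
    /// Parse one of the seven documented scope strings.
    pub fn parse(raw: &str) -> Option<Scope> {
        Some(match raw {
            "script:read" => Scope::ScriptRead,
            "script:write" => Scope::ScriptWrite,
            "route:write" => Scope::RouteWrite,
            "domain:manage" => Scope::DomainManage,
            "log:read" => Scope::LogRead,
            "app:admin" => Scope::AppAdmin,
            "instance:admin" => Scope::InstanceAdmin,
            _ => return None,
        })
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum MintError {
    UnknownScope(String),
    BoundKeyWithInstanceScope, // the case the text rejects with 422
}

/// Mint-time validation: unknown scopes and bound keys carrying an
/// instance-wide scope are both rejected.
pub fn validate_mint(raw_scopes: &[&str], app_id: Option<AppId>) -> Result<HashSet<Scope>, MintError> {
    let mut scopes = HashSet::new();
    for &raw in raw_scopes {
        let scope = Scope::parse(raw).ok_or_else(|| MintError::UnknownScope(raw.to_string()))?;
        if app_id.is_some() && scope == Scope::InstanceAdmin {
            return Err(MintError::BoundKeyWithInstanceScope);
        }
        scopes.insert(scope);
    }
    Ok(scopes)
}

/// Hypothetical capability set, trimmed to three cases.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Capability { AppRead(AppId), AppWrite(AppId), InstanceManage }

impl Capability {
    pub fn app_id(self) -> Option<AppId> {
        match self {
            Capability::AppRead(app) | Capability::AppWrite(app) => Some(app),
            Capability::InstanceManage => None,
        }
    }
}

/// Assumed capability-to-scope mapping, for illustration only.
fn required_scope(cap: Capability) -> Scope {
    match cap {
        Capability::AppRead(_) => Scope::ScriptRead,
        Capability::AppWrite(_) => Scope::ScriptWrite,
        Capability::InstanceManage => Scope::InstanceAdmin,
    }
}

/// Principal trimmed to the two fields this sketch needs.
pub struct Principal {
    pub scopes: Option<HashSet<Scope>>, // None = cookie session
    pub app_binding: Option<AppId>,     // Some = key bound to one app
}

/// allow = role_grants AND scope check AND app-binding check.
pub fn can(principal: &Principal, cap: Capability, role_grants: bool) -> bool {
    let scope_ok = principal.scopes.as_ref().map_or(true, |s| s.contains(&required_scope(cap)));
    let binding_ok = principal.app_binding.is_none() || cap.app_id() == principal.app_binding;
    role_grants && scope_ok && binding_ok
}
```

Because all three terms are conjoined, an API key can only narrow what the user's role already allows; it never widens it.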
+
+### Schema (Migration 0006)
+
+```sql
+ALTER TABLE admin_users
+    ADD COLUMN instance_role TEXT NOT NULL DEFAULT 'owner'
+        CHECK (instance_role IN ('owner','admin','member')),
+    ADD COLUMN email TEXT UNIQUE,
+    ADD COLUMN mfa_secret TEXT; -- reserved slot, not built
+
+CREATE TABLE app_members (
+    app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
+    user_id UUID NOT NULL REFERENCES admin_users(id) ON DELETE CASCADE,
+    role TEXT NOT NULL CHECK (role IN ('app_admin','editor','viewer')),
+    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
+    PRIMARY KEY (app_id, user_id)
+);
+CREATE INDEX app_members_user_id_idx ON app_members (user_id);
+
+CREATE TABLE api_keys (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    user_id UUID NOT NULL REFERENCES admin_users(id) ON DELETE CASCADE,
+    hash TEXT NOT NULL, -- Argon2id PHC
+    prefix TEXT NOT NULL, -- first 8 chars after `pic_`
+    name TEXT NOT NULL,
+    scopes TEXT[] NOT NULL, -- intersected with role at check time
+    app_id UUID NULL REFERENCES apps(id) ON DELETE CASCADE,
+    expires_at TIMESTAMPTZ NULL,
+    last_used_at TIMESTAMPTZ NULL,
+    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
+);
+CREATE INDEX api_keys_prefix_idx ON api_keys (prefix);
+CREATE INDEX api_keys_user_id_idx ON api_keys (user_id);
+
+-- Reserved (not built this phase):
+-- invites (token, email, instance_role, app_id, app_role, invited_by, expires_at, consumed_at)
+-- service_accounts (id, name, owning_user_id, …)
+```
+
+### New Endpoints (additive — no API major bump)
+
+```
+POST   /api/v1/admin/api-keys — { name, scopes[], app_id?, expires_at? }
+         → 201 { …, raw_token } (raw returned exactly once)
+GET    /api/v1/admin/api-keys — list caller's own keys (no raw)
+DELETE /api/v1/admin/api-keys/{id} — caller's own only
+```
+
+Every existing `/api/v1/admin/*` endpoint is re-gated from "any authed admin" to a specific `Capability`.
Request/response shapes are unchanged; what changes is the set of callers each endpoint accepts (a `member` now gets 403 on app surfaces they're not part of, where before they would have been 401-or-200 depending only on session validity).
+
+### Out of Scope (Phase 3.5)
+
+Schema room only, not built:
+
+- **Invites** — email-based join flow; `invites` table reserved in the migration comment block.
+- **MFA / TOTP** — `mfa_secret` column reserved on `admin_users`.
+- **Service accounts** — reserved as a future table; for now, every API key belongs to a human `admin_users` row.
+
+Defer to follow-up sessions: dashboard surfaces for invites / member management / key minting (curl is the supported interface this phase), OIDC / SAML / SCIM, the `picloud` CLI binary itself, email/SMTP delivery of invites, audit log shipping.
+
+---
+
 ## 12. Development Roadmap
 
 ### Phase 1: MVP ✓ (Shipped)
@@ -1048,13 +1194,15 @@ The scripts and routes endpoints keep their existing shape — this avoids forci
 
 ### Phase 3: v1.0.x — Foundations (Current focus)
 
-Two foundation pieces that must land before the v1.1 service expansion, because retrofitting them later is expensive.
+Three foundation pieces that must land before the v1.1 service expansion, because retrofitting them later is expensive.
 
 **3a. Admin auth** — ✓ shipped. See section 11.4. Per-user `admin_users` (not a shared secret), Argon2id passwords, env-var bootstrap of the first admin, session-token doubling as bearer token for API. No roles in this cut; schema is forward-compatible with later RBAC.
 
 **3b. Multi-app scoping** — ✓ shipped. See section 11.5. `apps`, `app_domains`, `app_slug_history` tables; `app_id` columns on `scripts`, `routes`, `execution_logs`. Migration assigns existing data to a `default` app and always claims `localhost`; a Rust-side bootstrap inserts a `Hello World` script + `/hello` route when the default app is empty. Orchestrator dispatch is two-phase (Host → app → route trie).
`/api/v1/execute/{id}/*` continues to work without a public domain claim. Dashboard is app-hierarchical (`/admin/apps`, `/admin/apps/{slug}/...`); API stays flat with new endpoints under `/api/v1/admin/apps/*` and a `?app=` filter on script listing. Per-app admin roles deferred.
 
-**Why both before v1.1**: every v1.1 service (KV, docs, users, etc.) needs an `app_id` scoping key in its schema. Adding it now, with one small migration on existing tables, is cheap. Adding it after those services ship is several migrations on populated data.
+**3c. Users, roles, and bearer-token auth** — pending. See section 11.6. Adds `instance_role` to `admin_users` (`owner`/`admin`/`member`), `app_members` for per-app `app_admin`/`editor`/`viewer` grants, and `api_keys` for `Authorization: Bearer pic_…` credentials. Unifies cookie-session and API-key paths behind a single `can(principal, capability)` gate; list endpoints filter by membership at SQL for `member` users. Dashboard surfaces, invites, MFA, service accounts, and the `picloud` CLI binary are deferred — schema room only.
+
+**Why all three before v1.1**: every v1.1 service (KV, docs, users, etc.) needs both an `app_id` scoping key in its schema and a `Principal` to authorize against. Adding both now is one migration each on a small surface; adding them after the SDKs ship is many migrations on populated data plus a re-gate of every SDK call.
 
 ---