feat: custom routing — bind scripts to your own URLs

Scripts can now answer at user-chosen paths (e.g. /greet, /greet/:name,
/webhooks/*), on user-chosen hosts (strict or *.example.com wildcards),
on user-chosen methods. The internal /api/v1/execute/{id} endpoint
stays as the always-available ID-based bypass.

Routing rules (decided in design with the user; see chat history):

  Path kinds:
    exact   /greet              literal
    prefix  /greet/*            strict-subtree; stored as "/greet/";
                                does NOT match bare /greet (add an
                                exact route for that case)
    param   /users/:id          :name captures one whole segment;
                                mid-segment colons are rejected;
                                {name} is reserved for a future SDK
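
  A minimal sketch of how the three path kinds above could be told apart
  at parse time. PathPattern, Segment and parse_path are hypothetical
  names for illustration, not the actual pattern.rs API, and rejecting a
  prefix pattern that also contains a :param is an assumption of this
  sketch rather than something the rules above state.

```rust
/// One parsed path pattern (hypothetical type, not the real pattern.rs API).
#[derive(Debug, PartialEq)]
enum PathPattern {
    Exact(String),
    /// Stored with its trailing slash, e.g. "/greet/" for "/greet/*".
    Prefix(String),
    Param(Vec<Segment>),
}

#[derive(Debug, PartialEq)]
enum Segment {
    Literal(String),
    Param(String),
}

fn parse_path(raw: &str) -> Result<PathPattern, String> {
    if !raw.starts_with('/') {
        return Err(format!("path must start with '/': {raw}"));
    }
    // {name} is reserved for a future SDK, so refuse it outright.
    if raw.contains('{') || raw.contains('}') {
        return Err("{name} syntax is reserved".to_string());
    }
    // "/greet/*" becomes the strict-subtree prefix "/greet/".
    if let Some(base) = raw.strip_suffix("/*") {
        // Assumption: prefix patterns are purely literal.
        if base.contains(':') || base.contains('*') {
            return Err(format!("unsupported prefix pattern: {raw}"));
        }
        return Ok(PathPattern::Prefix(format!("{base}/")));
    }
    if raw.contains('*') {
        return Err(format!("'*' is only valid as a trailing /*: {raw}"));
    }
    if !raw.contains(':') {
        return Ok(PathPattern::Exact(raw.to_string()));
    }
    let mut segments = Vec::new();
    for seg in raw[1..].split('/') {
        match seg.strip_prefix(':') {
            // ":id" captures one whole segment.
            Some(name) if !name.is_empty() && !name.contains(':') => {
                segments.push(Segment::Param(name.to_string()));
            }
            None if !seg.contains(':') => {
                segments.push(Segment::Literal(seg.to_string()));
            }
            // Covers mid-segment colons such as "a:b" and a bare ":".
            _ => return Err(format!("bad colon placement in segment '{seg}'")),
        }
    }
    Ok(PathPattern::Param(segments))
}
```

  Under these assumptions parse_path("/greet/*") yields the stored
  prefix "/greet/", while "/a:b" and "/users/{id}" are rejected.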

  Host kinds:
    any                         no Host header constraint
    strict  sub.example.com     literal match (case-insensitive)
    wildcard *.example.com      suffix match; multi-level subdomains OK
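
  The host kinds above can be illustrated with a small predicate.
  HostPattern and host_matches are hypothetical names. This sketch
  assumes that wildcard matching is also case-insensitive and that
  *.example.com does not match the bare apex example.com; the rules
  above do not settle either point. It also does not strip a port from
  the Host header.

```rust
/// Host constraint for a route (hypothetical type for illustration).
enum HostPattern {
    /// No Host header constraint.
    Any,
    /// Literal host, compared case-insensitively.
    Strict(String),
    /// Suffix after "*.", e.g. "example.com" for "*.example.com".
    Wildcard(String),
}

fn host_matches(pattern: &HostPattern, host: &str) -> bool {
    let host = host.to_ascii_lowercase();
    match pattern {
        HostPattern::Any => true,
        HostPattern::Strict(expected) => host == expected.to_ascii_lowercase(),
        HostPattern::Wildcard(suffix) => {
            // Require at least one label before ".example.com", so that
            // "a.b.example.com" matches but "badexample.com" does not.
            let dotted = format!(".{}", suffix.to_ascii_lowercase());
            host.len() > dotted.len() && host.ends_with(&dotted)
        }
    }
}
```

  With this reading, a wildcard on example.com accepts a.b.example.com
  and refuses both example.com and badexample.com.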

  Within-kind uniqueness:
    two routes of the same kind that could match the same request
    conflict at config time. Algorithm (orchestrator_core::routing::
    conflict):
      exact:  literal equality
      prefix: literal equality (longer-prefix coexists; longer wins
              at request time)
      param:  same segment count + same literals at every
              literal-vs-literal position (the user's example:
              :id vs :userId at same shape is a conflict)
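
  The param rule above fits in a few lines. This is a sketch of the
  predicate as described, not the code in conflict.rs; Segment,
  segments and param_routes_conflict are hypothetical names, and
  segments is a deliberately naive helper that does no validation.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Segment {
    Literal(String),
    Param(String),
}

/// Naive splitter for illustration only: ":x" is a param, anything else a literal.
fn segments(path: &str) -> Vec<Segment> {
    path.trim_start_matches('/')
        .split('/')
        .map(|s| match s.strip_prefix(':') {
            Some(name) => Segment::Param(name.to_string()),
            None => Segment::Literal(s.to_string()),
        })
        .collect()
}

/// Two param routes conflict when they have the same segment count and
/// agree at every position where both sides are literals. A param on
/// either side can stand for any value, so it never rules a conflict out.
fn param_routes_conflict(a: &[Segment], b: &[Segment]) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|pair| match pair {
            (Segment::Literal(x), Segment::Literal(y)) => x == y,
            _ => true,
        })
}
```

  So /users/:id and /users/:userId conflict, /users/:id and /posts/:id
  do not, and /:a/x conflicts with /y/:b because the request /y/x would
  match both.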

  Request-time precedence:
    exact > param > prefix
    among non-exact: more leading-literal segments wins
    tie: param > prefix (more constrained)
    within prefix: longest matching prefix wins
    host bucket: strict > wildcard (longer suffix) > any; fall through
    to less specific buckets when path doesn't match
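
  One way to read the path precedence above is as a lexicographic sort
  key. The sketch below encodes this reading: exact first, then more
  leading-literal segments, then param over prefix, then longest
  prefix. Candidate, specificity and pick_best are hypothetical names,
  the ordering of the first two non-exact rules is an interpretation,
  and host buckets are left out entirely.

```rust
/// A route that already matched the request path (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Candidate {
    Exact,
    Param { leading_literals: usize },
    Prefix { leading_literals: usize, prefix_len: usize },
}

/// Larger tuples win. Tuples compare field by field, left to right.
fn specificity(c: Candidate) -> (u8, usize, u8, usize) {
    match c {
        // Exact beats every non-exact candidate.
        Candidate::Exact => (1, 0, 0, 0),
        // Among non-exact: leading literals first, then param over prefix.
        Candidate::Param { leading_literals } => (0, leading_literals, 1, 0),
        // Within prefix: the longest matching prefix wins.
        Candidate::Prefix { leading_literals, prefix_len } => {
            (0, leading_literals, 0, prefix_len)
        }
    }
}

fn pick_best(candidates: &[Candidate]) -> Option<Candidate> {
    candidates.iter().copied().max_by_key(|c| specificity(*c))
}
```

  Under this reading a param route and a prefix route with the same
  number of leading literals resolve in favour of the param route,
  while a prefix route with more leading literals beats a param route
  with fewer.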

  Reserved path prefixes: /api/, /admin/, /healthz, /version

  Routes that fail validation at config time are rejected with 422 and
  the precise parse error; conflicting routes are rejected with 409 and
  the conflicting route in the body (so the dashboard can render the
  conflict inline).

What landed:

  * 0003_routes.sql — routes table (host_kind, host, host_param_name,
    path_kind, path, method, script_id) with UNIQUE index on the
    literal binding tuple. Schema 2 → 3.

  * shared::Route / HostKind / PathKind — flat storage shape that
    crosses wire boundaries cleanly.

  * orchestrator_core::routing — four sub-modules, all unit-tested:
      pattern.rs (16 tests)  parse + validate + display
      conflict.rs (12 tests) within-kind overlap predicate
      matcher.rs (12 tests)  runtime dispatch (specificity-aware)
      table.rs               Arc<RwLock<Vec<CompiledRoute>>>
                             shared by manager (writes) and
                             orchestrator (reads); atomic replace
                             after each admin write

  * manager-core::route_admin — five new admin endpoints under
    /api/v1/admin:
      POST   /scripts/{id}/routes      create
      GET    /scripts/{id}/routes      list per script
      DELETE /routes/{route_id}        delete (refreshes table)
      POST   /routes:check             pre-flight conflict check
                                       (powers the dashboard's
                                       live conflict warning)
      POST   /routes:match             synthetic URL → matched
                                       route + extracted params
                                       (powers the dashboard's
                                       match-preview tool)
    Stored path strings stay raw (user-typed); normalization
    happens only in the in-memory CompiledRoute so re-parses are
    idempotent.

  * orchestrator_core::api::user_routes_router — fallback handler
    mounted in picloud after the system routes. Reads Host /
    method / path / query from the request, dispatches via the
    table, builds an ExecRequest with params/query/rest filled,
    calls the executor, writes to the log sink. 10 MiB body cap.

  * executor-core::ctx (SDK 1.0 → 1.1) — adds
      ctx.request.params  (map of named-param captures)
      ctx.request.query   (parsed query string)
      ctx.request.rest    (suffix for prefix routes; "" otherwise)
    All three are always present (empty when not applicable) so
    scripts can read them unconditionally.

  * picloud::build_app — now async; loads routes at startup,
    populates the shared table, mounts route_admin_router under
    /api/v1/admin alongside the script CRUD, and the user-routes
    fallback at the app root.

  * caddy/Caddyfile + Caddyfile.prod widened: anything not
    /healthz, /version, /api/v1/admin/*, /api/v1/execute/*,
    /api/* (404 sunset), or /admin/* (dashboard) → picloud.

  * Dashboard moves to /admin/* via SvelteKit paths.base. Its
    internal Caddy strips the prefix and serves with SPA fallback.
    All in-app links use $app/paths. The dashboard URL is now
    http://localhost:8000/admin/; this is a one-time breaking change,
    made so that user routes can claim the rest of the URL space.

  * PICLOUD_PUBLIC_BASE_URL env var, exposed via /version so the
    dashboard renders full URLs for routes regardless of the
    operator's external port / TLS setup.

  * memory_limit_mb stays in the schema, still v1.3+ advisory.

Verified live through Caddy:
  /version              → schema 3, sdk 1.1, public_base_url
  GET /admin/           → 200, dashboard HTML containing "PiCloud"
  POST /api/v1/admin/scripts → 201
  POST .../scripts/{id}/routes (path=/greet/:name) → 201
  GET /greet/alice?lang=en → 200 {"name":"alice","q":"en"}
  POST conflicting route → 409 with conflicting_route body
  POST /admin/foo route → 422 "reserved"
  POST /api/v1/admin/routes:match → matched + params extracted
  GET /unbound-path → 404 JSON

Tests:
  * 40 routing unit tests (pattern + conflict + matcher tables)
  * 14 executor-core unit tests (one new for ctx.request.params/
    query/rest exposure)
  * 32 integration tests (10 new for routing CRUD + dispatch +
    conflict + reserved + specificity tie-break + match preview +
    delete invalidation + /version returns public_base_url)
  * default cargo test --workspace stays green; the integration suite
    is opt-in (set DATABASE_URL and pass --include-ignored)

Bumps: schema 2 → 3; SDK 1.0 → 1.1; product 0.3.0 → 0.4.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-23 18:18:16 +02:00
parent f51924fdbc
commit 07e2a62d98
36 changed files with 2449 additions and 111 deletions

@@ -8,7 +8,7 @@ use std::time::Duration;
use axum::{
body::Bytes,
-extract::{Path, State},
+extract::{Path, Request, State},
http::{HeaderMap, HeaderName, HeaderValue, StatusCode},
response::{IntoResponse, Response},
routing::post,
@@ -24,12 +24,16 @@ use uuid::Uuid;
use crate::client::ExecutorClient;
use crate::resolver::{ResolverError, ScriptResolver};
use crate::routing::RouteTable;
/// State shared by data-plane handlers.
pub struct DataPlaneState<E, R> {
pub executor: Arc<E>,
pub resolver: Arc<R>,
pub log_sink: Arc<dyn ExecutionLogSink>,
/// Routing table for user-defined paths. Shared with the manager
/// (admin router writes; this side reads).
pub routes: Arc<RouteTable>,
}
impl<E, R> Clone for DataPlaneState<E, R> {
@@ -38,11 +42,13 @@ impl<E, R> Clone for DataPlaneState<E, R> {
executor: self.executor.clone(),
resolver: self.resolver.clone(),
log_sink: self.log_sink.clone(),
routes: self.routes.clone(),
}
}
}
-/// Build the data-plane router. Handles `POST /execute/:id`.
+/// Build the data-plane router. Handles `POST /execute/:id` — the
+/// always-available ID-based bypass.
pub fn data_plane_router<E, R>(state: DataPlaneState<E, R>) -> Router
where
E: ExecutorClient + 'static,
@@ -53,6 +59,19 @@ where
.with_state(state)
}
/// Build a router that handles ALL paths via the user-defined routing
/// table. Intended to be merged into the picloud app router as a
/// fallback (after the system routes are mounted).
pub fn user_routes_router<E, R>(state: DataPlaneState<E, R>) -> Router
where
E: ExecutorClient + 'static,
R: ScriptResolver + 'static,
{
Router::new()
.fallback(user_route_handler::<E, R>)
.with_state(state)
}
// ----------------------------------------------------------------------------
// Handlers
// ----------------------------------------------------------------------------
@@ -106,6 +125,113 @@ where
Ok(exec_response_to_http(outcome?))
}
async fn user_route_handler<E, R>(
State(state): State<DataPlaneState<E, R>>,
request: Request,
) -> Result<Response, ApiError>
where
E: ExecutorClient + 'static,
R: ScriptResolver + 'static,
{
let method = request.method().as_str().to_string();
let uri = request.uri().clone();
let path = uri.path().to_string();
let query_str = uri.query().unwrap_or("").to_string();
let host = request
.headers()
.get("host")
.and_then(|h| h.to_str().ok())
.unwrap_or("")
.to_string();
let headers = request.headers().clone();
let Some(matched) = state.routes.match_request(&host, &method, &path) else {
return Ok((
StatusCode::NOT_FOUND,
Json(serde_json::json!({
"error": format!("no route matches {method} {path}")
})),
)
.into_response());
};
let script = state
.resolver
.resolve(matched.matched.script_id)
.await?
.ok_or(ApiError::NotFound(matched.matched.script_id))?;
// Drain the body now that we know we'll execute. 10 MiB cap matches
// the conservative default response/request size in the blueprint.
let body_bytes = match axum::body::to_bytes(request.into_body(), 10 * 1024 * 1024).await {
Ok(b) => b,
Err(e) => return Err(ApiError::BadRequest(format!("body read failed: {e}"))),
};
let mut req = build_exec_request(
matched.matched.script_id,
&script.name,
&headers,
&body_bytes,
)?;
req.path = path;
req.params = matched.params;
req.query = parse_query_string(&query_str);
req.rest = matched.rest.unwrap_or_default();
req.sandbox_overrides = script.sandbox;
let request_id = req.request_id;
let request_path = req.path.clone();
let request_headers = req.headers.clone();
let request_body = req.body.clone();
let timeout = Duration::from_secs(u64::from(script.timeout_seconds));
let started = Utc::now();
let outcome = state.executor.execute(&script.source, req, timeout).await;
let finished = Utc::now();
let log = build_execution_log(
matched.matched.script_id,
request_id,
request_path,
request_headers,
request_body,
&outcome,
started,
finished,
);
if let Err(e) = state.log_sink.record(log).await {
tracing::warn!(
error = %e,
script_id = %matched.matched.script_id,
"failed to persist execution log"
);
}
Ok(exec_response_to_http(outcome?))
}
fn parse_query_string(s: &str) -> BTreeMap<String, String> {
let mut out = BTreeMap::new();
if s.is_empty() {
return out;
}
for pair in s.split('&') {
let (k, v) = match pair.split_once('=') {
Some((k, v)) => (k, v),
None => (pair, ""),
};
let key = urlencoding::decode(k)
.map(std::borrow::Cow::into_owned)
.unwrap_or_default();
let val = urlencoding::decode(v)
.map(std::borrow::Cow::into_owned)
.unwrap_or_default();
out.insert(key, val);
}
out
}
// ----------------------------------------------------------------------------
// Marshalling
// ----------------------------------------------------------------------------
@@ -139,6 +265,9 @@ fn build_exec_request(
path: format!("/api/execute/{id}"),
headers: hmap,
body: body_json,
params: BTreeMap::new(),
query: BTreeMap::new(),
rest: String::new(),
// Overwritten by the handler after the script is resolved.
sandbox_overrides: picloud_shared::ScriptSandbox::default(),
})