feat: custom routing — bind scripts to your own URLs

Scripts can now answer at user-chosen paths (e.g. /greet, /greet/:name,
/webhooks/*), on user-chosen hosts (strict or *.example.com wildcards),
on user-chosen methods. The internal /api/v1/execute/{id} endpoint
stays as the always-available ID-based bypass.

Routing rules (decided in design with the user; see chat history):

  Path kinds:
    exact   /greet              literal
    prefix  /greet/*            strict-subtree; stored as "/greet/";
                                does NOT match bare /greet (add an
                                exact route for that case)
    param   /users/:id          :name captures one whole segment;
                                mid-segment colons are rejected;
                                {name} is reserved for a future SDK

  Host kinds:
    any                         no Host header constraint
    strict  sub.example.com     literal match (case-insensitive)
    wildcard *.example.com      suffix match; multi-level subdomains OK

  Within-kind uniqueness:
    two routes of the same kind that could match the same request
    conflict at config time. Algorithm (orchestrator_core::routing::
    conflict):
      exact:  literal equality
      prefix: literal equality (longer-prefix coexists; longer wins
              at request time)
      param:  same segment count + same literals at every
              literal-vs-literal position (the user's example:
              :id vs :userId at same shape is a conflict)

  Request-time precedence:
    exact > param > prefix
    among non-exact: more leading-literal segments wins
    tie: param > prefix (more constrained)
    within prefix: longest matching prefix wins
    host bucket: strict > wildcard (longer suffix) > any; fall through
    to less specific buckets when path doesn't match

  Reserved path prefixes: /api/, /admin/, /healthz, /version

  Routes that look invalid at config time return 422 with the precise
  parse error; conflicting routes return 409 with the conflicting route
  in the body (so the dashboard can render the conflict inline).

What landed:

  * 0003_routes.sql — routes table (host_kind, host, host_param_name,
    path_kind, path, method, script_id) with UNIQUE index on the
    literal binding tuple. Schema 2 → 3.

  * shared::Route / HostKind / PathKind — flat storage shape that
    crosses wire boundaries cleanly.

  * orchestrator_core::routing — four sub-modules, all unit-tested:
      pattern.rs (16 tests)  parse + validate + display
      conflict.rs (12 tests) within-kind overlap predicate
      matcher.rs (12 tests)  runtime dispatch (specificity-aware)
      table.rs               Arc<RwLock<Vec<CompiledRoute>>>
                             shared by manager (writes) and
                             orchestrator (reads); atomic replace
                             after each admin write

  * manager-core::route_admin — five new admin endpoints under
    /api/v1/admin:
      POST   /scripts/{id}/routes      create
      GET    /scripts/{id}/routes      list per script
      DELETE /routes/{route_id}        delete (refreshes table)
      POST   /routes:check             pre-flight conflict check
                                       (powers the dashboard's
                                       live conflict warning)
      POST   /routes:match             synthetic URL → matched
                                       route + extracted params
                                       (powers the dashboard's
                                       match-preview tool)
    Stored path strings stay raw (user-typed); normalization
    happens only in the in-memory CompiledRoute so re-parses are
    idempotent.

  * orchestrator_core::api::user_routes_router — fallback handler
    mounted in picloud after the system routes. Reads Host /
    method / path / query from the request, dispatches via the
    table, builds an ExecRequest with params/query/rest filled,
    calls the executor, writes to the log sink. 10 MiB body cap.

  * executor-core::ctx (SDK 1.0 → 1.1) — adds
      ctx.request.params  (map of named-param captures)
      ctx.request.query   (parsed query string)
      ctx.request.rest    (suffix for prefix routes; "" otherwise)
    All three are always present (empty when not applicable) so
    scripts can read them unconditionally.

  * picloud::build_app — now async; loads routes at startup,
    populates the shared table, mounts route_admin_router under
    /api/v1/admin alongside the script CRUD, and the user-routes
    fallback at the app root.

  * caddy/Caddyfile + Caddyfile.prod widened: anything not
    /healthz, /version, /api/v1/admin/*, /api/v1/execute/*,
    /api/* (404 sunset), or /admin/* (dashboard) → picloud.

  * Dashboard moves to /admin/* via SvelteKit paths.base. Its
    internal Caddy strips the prefix and serves with SPA fallback.
    All in-app links use $app/paths. The dashboard URL is now
    http://localhost:8000/admin/ — one-time break for the new
    URL freedom users gained.

  * PICLOUD_PUBLIC_BASE_URL env var, exposed via /version so the
    dashboard renders full URLs for routes regardless of the
    operator's external port / TLS setup.

  * memory_limit_mb stays in the schema, still v1.3+ advisory.

Verified live through Caddy:
  /version              → schema 3, sdk 1.1, public_base_url
  GET /admin/           → 200, dashboard HTML containing "PiCloud"
  POST /api/v1/admin/scripts → 201
  POST .../scripts/{id}/routes (path=/greet/:name) → 201
  GET /greet/alice?lang=en → 200 {"name":"alice","q":"en"}
  POST conflicting route → 409 with conflicting_route body
  POST /admin/foo route → 422 "reserved"
  POST /api/v1/admin/routes:match → matched + params extracted
  GET /unbound-path → 404 JSON

Tests:
  * 40 routing unit tests (pattern + conflict + matcher tables)
  * 14 executor-core unit tests (one new for ctx.request.params/
    query/rest exposure)
  * 32 integration tests (10 new for routing CRUD + dispatch +
    conflict + reserved + specificity tie-break + match preview +
    delete invalidation + /version returns public_base_url)
  * default cargo test --workspace stays green; opt-in via
    DATABASE_URL + --include-ignored for the integration suite

Bumps: schema 2 → 3; SDK 1.0 → 1.1; product 0.3.0 → 0.4.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-05-23 18:18:16 +02:00
parent f51924fdbc
commit 07e2a62d98
36 changed files with 2449 additions and 111 deletions

View File

@@ -22,3 +22,4 @@ uuid.workspace = true
chrono.workspace = true
reqwest.workspace = true
tokio.workspace = true
urlencoding.workspace = true

View File

@@ -8,7 +8,7 @@ use std::time::Duration;
use axum::{
body::Bytes,
extract::{Path, State},
extract::{Path, Request, State},
http::{HeaderMap, HeaderName, HeaderValue, StatusCode},
response::{IntoResponse, Response},
routing::post,
@@ -24,12 +24,16 @@ use uuid::Uuid;
use crate::client::ExecutorClient;
use crate::resolver::{ResolverError, ScriptResolver};
use crate::routing::RouteTable;
/// State shared by data-plane handlers.
pub struct DataPlaneState<E, R> {
pub executor: Arc<E>,
pub resolver: Arc<R>,
pub log_sink: Arc<dyn ExecutionLogSink>,
/// Routing table for user-defined paths. Shared with the manager
/// (admin router writes; this side reads).
pub routes: Arc<RouteTable>,
}
impl<E, R> Clone for DataPlaneState<E, R> {
@@ -38,11 +42,13 @@ impl<E, R> Clone for DataPlaneState<E, R> {
executor: self.executor.clone(),
resolver: self.resolver.clone(),
log_sink: self.log_sink.clone(),
routes: self.routes.clone(),
}
}
}
/// Build the data-plane router. Handles `POST /execute/:id`.
/// Build the data-plane router. Handles `POST /execute/:id` — the
/// always-available ID-based bypass.
pub fn data_plane_router<E, R>(state: DataPlaneState<E, R>) -> Router
where
E: ExecutorClient + 'static,
@@ -53,6 +59,19 @@ where
.with_state(state)
}
/// Build a router that handles ALL paths via the user-defined routing
/// table. Intended to be merged into the picloud app router as a
/// fallback (after the system routes are mounted).
pub fn user_routes_router<E, R>(state: DataPlaneState<E, R>) -> Router
where
E: ExecutorClient + 'static,
R: ScriptResolver + 'static,
{
Router::new()
.fallback(user_route_handler::<E, R>)
.with_state(state)
}
// ----------------------------------------------------------------------------
// Handlers
// ----------------------------------------------------------------------------
@@ -106,6 +125,113 @@ where
Ok(exec_response_to_http(outcome?))
}
async fn user_route_handler<E, R>(
State(state): State<DataPlaneState<E, R>>,
request: Request,
) -> Result<Response, ApiError>
where
E: ExecutorClient + 'static,
R: ScriptResolver + 'static,
{
let method = request.method().as_str().to_string();
let uri = request.uri().clone();
let path = uri.path().to_string();
let query_str = uri.query().unwrap_or("").to_string();
let host = request
.headers()
.get("host")
.and_then(|h| h.to_str().ok())
.unwrap_or("")
.to_string();
let headers = request.headers().clone();
let Some(matched) = state.routes.match_request(&host, &method, &path) else {
return Ok((
StatusCode::NOT_FOUND,
Json(serde_json::json!({
"error": format!("no route matches {method} {path}")
})),
)
.into_response());
};
let script = state
.resolver
.resolve(matched.matched.script_id)
.await?
.ok_or(ApiError::NotFound(matched.matched.script_id))?;
// Drain the body now that we know we'll execute. 10 MiB cap matches
// the conservative default response/request size in the blueprint.
let body_bytes = match axum::body::to_bytes(request.into_body(), 10 * 1024 * 1024).await {
Ok(b) => b,
Err(e) => return Err(ApiError::BadRequest(format!("body read failed: {e}"))),
};
let mut req = build_exec_request(
matched.matched.script_id,
&script.name,
&headers,
&body_bytes,
)?;
req.path = path;
req.params = matched.params;
req.query = parse_query_string(&query_str);
req.rest = matched.rest.unwrap_or_default();
req.sandbox_overrides = script.sandbox;
let request_id = req.request_id;
let request_path = req.path.clone();
let request_headers = req.headers.clone();
let request_body = req.body.clone();
let timeout = Duration::from_secs(u64::from(script.timeout_seconds));
let started = Utc::now();
let outcome = state.executor.execute(&script.source, req, timeout).await;
let finished = Utc::now();
let log = build_execution_log(
matched.matched.script_id,
request_id,
request_path,
request_headers,
request_body,
&outcome,
started,
finished,
);
if let Err(e) = state.log_sink.record(log).await {
tracing::warn!(
error = %e,
script_id = %matched.matched.script_id,
"failed to persist execution log"
);
}
Ok(exec_response_to_http(outcome?))
}
fn parse_query_string(s: &str) -> BTreeMap<String, String> {
let mut out = BTreeMap::new();
if s.is_empty() {
return out;
}
for pair in s.split('&') {
let (k, v) = match pair.split_once('=') {
Some((k, v)) => (k, v),
None => (pair, ""),
};
let key = urlencoding::decode(k)
.map(std::borrow::Cow::into_owned)
.unwrap_or_default();
let val = urlencoding::decode(v)
.map(std::borrow::Cow::into_owned)
.unwrap_or_default();
out.insert(key, val);
}
out
}
// ----------------------------------------------------------------------------
// Marshalling
// ----------------------------------------------------------------------------
@@ -139,6 +265,9 @@ fn build_exec_request(
path: format!("/api/execute/{id}"),
headers: hmap,
body: body_json,
params: BTreeMap::new(),
query: BTreeMap::new(),
rest: String::new(),
// Overwritten by the handler after the script is resolved.
sandbox_overrides: picloud_shared::ScriptSandbox::default(),
})

View File

@@ -11,7 +11,8 @@
pub mod api;
pub mod client;
pub mod resolver;
pub mod routing;
pub use api::{data_plane_router, DataPlaneState};
pub use api::{data_plane_router, user_routes_router, DataPlaneState};
pub use client::{ExecutorClient, LocalExecutorClient, RemoteExecutorClient};
pub use resolver::{ResolverError, ScriptResolver};

View File

@@ -0,0 +1,238 @@
//! Within-kind overlap detection.
//!
//! Two routes "conflict" if they're the same kind AND there exists at
//! least one request that both would match. Cross-kind overlap is OK —
//! the runtime matcher resolves it via the precedence rule
//! (`exact > param > prefix`, with leading-literal specificity).
//!
//! The user's design intent (see chat history): "There should only be
//! one eligible route in each domain for a path within the same kind."
//! This file is the predicate that enforces it at write time.
use super::pattern::{HostPattern, PathPattern, PathSegment};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConflictReason {
/// Two exact paths with identical literal value.
IdenticalExact,
/// Two prefix paths with identical literal value (different
/// lengths coexist; the longer wins at runtime).
IdenticalPrefix,
/// Two param paths with the same shape (same segment count and
/// matching literals at every literal-vs-literal position).
OverlappingParam,
}
/// True if these two patterns conflict (within-kind only). Callers
/// should already have decided the host/method dimensions overlap.
#[must_use]
pub fn conflicts(a: &PathPattern, b: &PathPattern) -> Option<ConflictReason> {
match (a, b) {
(PathPattern::Exact(x), PathPattern::Exact(y)) if x == y => {
Some(ConflictReason::IdenticalExact)
}
(PathPattern::Prefix(x), PathPattern::Prefix(y)) if x == y => {
Some(ConflictReason::IdenticalPrefix)
}
(PathPattern::Param(xs), PathPattern::Param(ys)) if param_overlap(xs, ys) => {
Some(ConflictReason::OverlappingParam)
}
_ => None,
}
}
/// Two host patterns "share a domain bucket" — i.e., a request to one
/// host satisfies both — exactly when their parsed forms agree on kind
/// and value. Conflict-checks across different domain buckets do not
/// apply (the runtime host precedence picks one bucket per request).
#[must_use]
pub fn hosts_overlap(a: &HostPattern, b: &HostPattern) -> bool {
match (a, b) {
(HostPattern::Any, HostPattern::Any) => true,
(HostPattern::Strict(x), HostPattern::Strict(y)) => x == y,
(HostPattern::Wildcard { suffix: x, .. }, HostPattern::Wildcard { suffix: y, .. }) => {
x == y
}
_ => false,
}
}
/// True if the two HTTP methods would dispatch to the same route.
/// `None` = "any method"; two Anys overlap, an Any overlaps any
/// specific method, and two specifics overlap iff equal.
#[must_use]
pub fn methods_overlap(a: Option<&str>, b: Option<&str>) -> bool {
match (a, b) {
(None, _) | (_, None) => true,
(Some(x), Some(y)) => x.eq_ignore_ascii_case(y),
}
}
fn param_overlap(a: &[PathSegment], b: &[PathSegment]) -> bool {
if a.len() != b.len() {
return false;
}
a.iter().zip(b.iter()).all(|(x, y)| match (x, y) {
(PathSegment::Literal(p), PathSegment::Literal(q)) => p == q,
// Any pair involving at least one Param accepts overlap.
_ => true,
})
}
// ----------------------------------------------------------------------------
// Tests
// ----------------------------------------------------------------------------
#[cfg(test)]
mod tests {
use super::super::pattern::{parse_path, PathPattern};
use super::*;
use picloud_shared::PathKind;
fn p(kind: PathKind, raw: &str) -> PathPattern {
parse_path(kind, raw).unwrap()
}
#[test]
fn identical_exacts_conflict() {
assert_eq!(
conflicts(&p(PathKind::Exact, "/x"), &p(PathKind::Exact, "/x")),
Some(ConflictReason::IdenticalExact)
);
}
#[test]
fn different_exacts_dont_conflict() {
assert_eq!(
conflicts(&p(PathKind::Exact, "/x"), &p(PathKind::Exact, "/y")),
None
);
}
#[test]
fn identical_prefixes_conflict() {
assert_eq!(
conflicts(
&p(PathKind::Prefix, "/greet/*"),
&p(PathKind::Prefix, "/greet/*")
),
Some(ConflictReason::IdenticalPrefix)
);
}
#[test]
fn different_length_prefixes_dont_conflict() {
assert_eq!(
conflicts(
&p(PathKind::Prefix, "/greet/*"),
&p(PathKind::Prefix, "/greet/sub/*")
),
None
);
}
#[test]
fn cross_kind_never_conflicts() {
assert_eq!(
conflicts(&p(PathKind::Exact, "/x"), &p(PathKind::Prefix, "/x/*")),
None
);
assert_eq!(
conflicts(
&p(PathKind::Exact, "/greet/alice"),
&p(PathKind::Param, "/greet/:name")
),
None
);
assert_eq!(
conflicts(
&p(PathKind::Param, "/greet/:name"),
&p(PathKind::Prefix, "/greet/*")
),
None
);
}
#[test]
fn same_shape_params_conflict_regardless_of_name() {
assert_eq!(
conflicts(
&p(PathKind::Param, "/users/:id"),
&p(PathKind::Param, "/users/:userId")
),
Some(ConflictReason::OverlappingParam)
);
}
#[test]
fn param_shapes_with_overlapping_literals_conflict() {
// /users/admin/posts is reachable by both.
assert_eq!(
conflicts(
&p(PathKind::Param, "/users/:id/posts"),
&p(PathKind::Param, "/users/admin/:action")
),
Some(ConflictReason::OverlappingParam)
);
}
#[test]
fn param_with_distinct_trailing_literal_no_conflict() {
assert_eq!(
conflicts(
&p(PathKind::Param, "/users/:id/posts"),
&p(PathKind::Param, "/users/:id/comments")
),
None
);
}
#[test]
fn params_of_different_lengths_no_conflict() {
assert_eq!(
conflicts(
&p(PathKind::Param, "/users/:id"),
&p(PathKind::Param, "/users/:id/posts")
),
None
);
}
#[test]
fn methods_any_overlaps_anything() {
assert!(methods_overlap(None, None));
assert!(methods_overlap(None, Some("GET")));
assert!(methods_overlap(Some("POST"), None));
}
#[test]
fn methods_case_insensitive() {
assert!(methods_overlap(Some("GET"), Some("get")));
}
#[test]
fn methods_different_specifics_no_overlap() {
assert!(!methods_overlap(Some("GET"), Some("POST")));
}
#[test]
fn hosts_strict_strict() {
let a = HostPattern::Strict("a.com".into());
let b = HostPattern::Strict("a.com".into());
let c = HostPattern::Strict("b.com".into());
assert!(hosts_overlap(&a, &b));
assert!(!hosts_overlap(&a, &c));
}
#[test]
fn hosts_strict_vs_wildcard_different_buckets() {
let s = HostPattern::Strict("sub.example.com".into());
let w = HostPattern::Wildcard {
suffix: "example.com".into(),
capture: None,
};
// They CAN match the same request at runtime; specificity
// picks one. They're not in the same conflict bucket.
assert!(!hosts_overlap(&s, &w));
}
}

View File

@@ -0,0 +1,454 @@
//! Runtime route matching.
//!
//! Given a `(host, method, path)` triple and a `RouteTable`, returns
//! the single best matching route plus extracted parameters, or `None`.
//!
//! Rules (decided in design discussion, see chat history):
//!
//! 1. Host dispatch: collect candidate routes whose host pattern
//! matches the request Host. `strict > wildcard (longer suffix
//! wins) > any`. Within the most-specific matching host bucket,
//! try path matching; if nothing matches, fall through to less
//! specific buckets.
//! 2. Within a host bucket, method must overlap (request method
//! equals route method, or route method is `any`).
//! 3. Path dispatch by kind: exact wins absolute; among non-exact
//! matches, more leading literal segments wins; tie → `param >
//! prefix`. Within prefix, longest matching prefix wins.
use std::collections::BTreeMap;
use super::pattern::{HostPattern, PathPattern, PathSegment};
#[derive(Debug, Clone)]
pub struct MatchResult {
pub matched: Matched,
pub params: BTreeMap<String, String>,
/// `Some(rest)` when the matched route is a prefix; the remainder
/// of the URL path after the matched prefix, with no leading slash.
pub rest: Option<String>,
/// Captured host parameter (future `{tenant}.example.com`); always
/// `None` for the v1.1 SDK since the brace syntax is reserved.
pub host_param: Option<(String, String)>,
}
/// Reference back into the caller's route storage. Generic so callers
/// can carry their own `Route` shape; only the patterns are required.
#[derive(Debug, Clone)]
pub struct Matched {
pub route_id: uuid::Uuid,
pub script_id: picloud_shared::ScriptId,
}
/// A single route ready for matching.
#[derive(Debug, Clone)]
pub struct CompiledRoute {
pub route_id: uuid::Uuid,
pub script_id: picloud_shared::ScriptId,
pub host: HostPattern,
pub path: PathPattern,
pub method: Option<String>,
}
/// Find the best matching route for the request. Returns `None` if no
/// route is eligible at any host-specificity level.
#[must_use]
pub fn r#match<'a, I>(
routes: I,
request_host: &str,
request_method: &str,
request_path: &str,
) -> Option<MatchResult>
where
I: IntoIterator<Item = &'a CompiledRoute>,
{
let candidates: Vec<&CompiledRoute> = routes.into_iter().collect();
// Group by host-specificity, descending. We try each bucket in
// turn — the moment we find a path match within a bucket, return.
let buckets = bucket_by_host(&candidates, request_host);
for bucket in buckets {
if let Some(result) = match_within_bucket(&bucket, request_method, request_path) {
return Some(result);
}
}
None
}
/// Optional host capture (param-name + extracted label) when a wildcard
/// match has the `{name}.example.com` capture syntax (deferred SDK
/// feature; always `None` today but the type carries through so we
/// don't have to thread it later).
type HostCapture = Option<(String, String)>;
type RouteHit<'a> = (&'a CompiledRoute, HostCapture);
/// Returns the routes whose host pattern accepts `request_host`,
/// grouped by host-specificity in descending order (most-specific
/// first). Higher-priority buckets are tried first by the matcher.
fn bucket_by_host<'a>(
candidates: &[&'a CompiledRoute],
request_host: &str,
) -> Vec<Vec<RouteHit<'a>>> {
let host_stripped = strip_port(request_host);
let mut hits: Vec<(HostBucket, &CompiledRoute, HostCapture)> = Vec::new();
for r in candidates {
if let Some(capture) = host_matches(&r.host, host_stripped) {
hits.push((host_bucket(&r.host), r, capture));
}
}
// Sort descending by bucket; group runs of equal buckets.
hits.sort_by(|a, b| b.0.cmp(&a.0));
let mut out: Vec<Vec<RouteHit<'a>>> = Vec::new();
let mut current_bucket: Option<HostBucket> = None;
for (bucket, route, capture) in hits {
if Some(bucket) == current_bucket {
out.last_mut().unwrap().push((route, capture));
} else {
current_bucket = Some(bucket);
out.push(vec![(route, capture)]);
}
}
out
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum HostBucket {
Any,
Wildcard(usize), // suffix length
Strict,
}
fn host_bucket(p: &HostPattern) -> HostBucket {
match p {
HostPattern::Any => HostBucket::Any,
HostPattern::Wildcard { suffix, .. } => HostBucket::Wildcard(suffix.len()),
HostPattern::Strict(_) => HostBucket::Strict,
}
}
#[allow(clippy::option_option)]
fn host_matches(pattern: &HostPattern, request_host: &str) -> Option<HostCapture> {
match pattern {
HostPattern::Any => Some(None),
HostPattern::Strict(value) => {
if value.eq_ignore_ascii_case(request_host) {
Some(None)
} else {
None
}
}
HostPattern::Wildcard { suffix, capture } => {
let host = request_host.to_ascii_lowercase();
let suffix = suffix.to_ascii_lowercase();
// Require: host ends with ".suffix" AND has at least one
// non-empty label before. Multi-level subdomains (a.b.suffix)
// do match — common platform convention for wildcard certs
// is one-level only, but for HTTP routing users almost
// always want "anything under this domain".
let dotted = format!(".{suffix}");
host.strip_suffix(&dotted)
.filter(|p| !p.is_empty())
.map(|label| capture.as_ref().map(|c| (c.clone(), label.to_string())))
}
}
}
fn strip_port(host: &str) -> &str {
host.split(':').next().unwrap_or(host)
}
fn match_within_bucket(
bucket: &[RouteHit<'_>],
request_method: &str,
request_path: &str,
) -> Option<MatchResult> {
// 1. Exact wins absolute.
for (route, host_param) in bucket {
if !method_matches(route.method.as_deref(), request_method) {
continue;
}
if let PathPattern::Exact(p) = &route.path {
if p == request_path {
return Some(MatchResult {
matched: Matched {
route_id: route.route_id,
script_id: route.script_id,
},
params: BTreeMap::new(),
rest: None,
host_param: host_param.clone(),
});
}
}
}
// 2. Among non-exact matches, score by leading-literal count;
// tie → param > prefix; tie among prefixes → longest wins.
let mut best: Option<(MatchScore, MatchResult)> = None;
for (route, host_param) in bucket {
if !method_matches(route.method.as_deref(), request_method) {
continue;
}
let (params, rest) = match &route.path {
PathPattern::Exact(_) => continue, // handled above
PathPattern::Prefix(prefix) => {
if let Some(rest) = request_path.strip_prefix(prefix.as_str()) {
// Successful prefix match. Empty rest is fine for
// "/*" matching "/", but for non-root the rest is
// typically a real path segment.
(BTreeMap::new(), Some(rest.to_string()))
} else {
continue;
}
}
PathPattern::Param(segs) => {
if let Some(captures) = match_param(segs, request_path) {
(captures, None)
} else {
continue;
}
}
};
let score = MatchScore {
leading_literals: route.path.leading_literal_count(),
kind_rank: kind_rank(&route.path),
prefix_len: match &route.path {
PathPattern::Prefix(p) => p.len(),
_ => 0,
},
};
let result = MatchResult {
matched: Matched {
route_id: route.route_id,
script_id: route.script_id,
},
params,
rest,
host_param: host_param.clone(),
};
match &best {
None => best = Some((score, result)),
Some((bs, _)) if score > *bs => best = Some((score, result)),
_ => {}
}
}
best.map(|(_, r)| r)
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct MatchScore {
leading_literals: usize,
kind_rank: u8, // higher = preferred at equal leading_literals
prefix_len: usize,
}
fn kind_rank(p: &PathPattern) -> u8 {
match p {
PathPattern::Exact(_) => 3, // handled separately but kept for completeness
PathPattern::Param(_) => 2,
PathPattern::Prefix(_) => 1,
}
}
fn method_matches(route_method: Option<&str>, request_method: &str) -> bool {
match route_method {
None => true,
Some(m) => m.eq_ignore_ascii_case(request_method),
}
}
fn match_param(segs: &[PathSegment], request_path: &str) -> Option<BTreeMap<String, String>> {
let req_segs: Vec<&str> = request_path
.trim_start_matches('/')
.split('/')
.filter(|s| !s.is_empty())
.collect();
if req_segs.len() != segs.len() {
return None;
}
let mut captures = BTreeMap::new();
for (pattern_seg, req_seg) in segs.iter().zip(req_segs.iter()) {
match pattern_seg {
PathSegment::Literal(lit) => {
if lit != req_seg {
return None;
}
}
PathSegment::Param(name) => {
captures.insert(name.clone(), (*req_seg).to_string());
}
}
}
Some(captures)
}
// ----------------------------------------------------------------------------
// Tests
// ----------------------------------------------------------------------------
#[cfg(test)]
mod tests {
use super::super::pattern::parse_path;
use super::*;
use picloud_shared::{PathKind, ScriptId};
use uuid::Uuid;
fn route(host: HostPattern, path_kind: PathKind, raw: &str) -> CompiledRoute {
CompiledRoute {
route_id: Uuid::new_v4(),
script_id: ScriptId::new(),
host,
path: parse_path(path_kind, raw).unwrap(),
method: None,
}
}
#[test]
fn exact_wins_over_param_and_prefix() {
let routes = vec![
route(HostPattern::Any, PathKind::Exact, "/greet"),
route(HostPattern::Any, PathKind::Param, "/:anything"),
route(HostPattern::Any, PathKind::Prefix, "/*"),
];
let r = r#match(&routes, "host", "GET", "/greet").unwrap();
assert_eq!(r.matched.route_id, routes[0].route_id);
}
#[test]
fn param_beats_prefix_at_equal_leading_literals() {
let routes = vec![
route(HostPattern::Any, PathKind::Prefix, "/foo/users/*"),
route(HostPattern::Any, PathKind::Param, "/foo/users/:id"),
];
let r = r#match(&routes, "host", "GET", "/foo/users/foo").unwrap();
assert_eq!(r.matched.route_id, routes[1].route_id);
assert_eq!(r.params.get("id").map(String::as_str), Some("foo"));
}
#[test]
fn longer_prefix_wins_over_shorter_prefix() {
let routes = vec![
route(HostPattern::Any, PathKind::Prefix, "/foo/*"),
route(HostPattern::Any, PathKind::Prefix, "/foo/users/*"),
];
let r = r#match(&routes, "host", "GET", "/foo/users/list").unwrap();
assert_eq!(r.matched.route_id, routes[1].route_id);
assert_eq!(r.rest.as_deref(), Some("list"));
}
#[test]
fn more_leading_literals_wins_across_kinds() {
// prefix has more leading literals than param → prefix wins.
let routes = vec![
route(HostPattern::Any, PathKind::Param, "/foo/:section"),
route(HostPattern::Any, PathKind::Prefix, "/foo/users/*"),
];
let r = r#match(&routes, "host", "GET", "/foo/users/list").unwrap();
assert_eq!(r.matched.route_id, routes[1].route_id);
}
#[test]
fn prefix_does_not_match_bare_root() {
let routes = vec![route(HostPattern::Any, PathKind::Prefix, "/greet/*")];
assert!(r#match(&routes, "host", "GET", "/greet").is_none());
assert!(r#match(&routes, "host", "GET", "/greet/x").is_some());
}
#[test]
fn host_strict_beats_wildcard() {
let routes = vec![
route(
HostPattern::Wildcard {
suffix: "example.com".into(),
capture: None,
},
PathKind::Exact,
"/greet",
),
route(
HostPattern::Strict("sub.example.com".into()),
PathKind::Exact,
"/greet",
),
];
// sub.example.com matches both; strict wins.
let r = r#match(&routes, "sub.example.com", "GET", "/greet").unwrap();
assert_eq!(r.matched.route_id, routes[1].route_id);
// other.example.com only matches wildcard.
let r2 = r#match(&routes, "other.example.com", "GET", "/greet").unwrap();
assert_eq!(r2.matched.route_id, routes[0].route_id);
}
#[test]
fn wildcard_requires_subdomain_label() {
let routes = vec![route(
HostPattern::Wildcard {
suffix: "example.com".into(),
capture: None,
},
PathKind::Exact,
"/x",
)];
// Bare apex shouldn't match a *.example.com wildcard.
assert!(r#match(&routes, "example.com", "GET", "/x").is_none());
// Two-level subdomains DO match (`.label.example.com` ends with `.example.com`).
assert!(r#match(&routes, "a.b.example.com", "GET", "/x").is_some());
}
#[test]
fn fallback_to_less_specific_host_when_path_misses() {
// Strict bucket has no path that matches; should fall through
// to the wildcard bucket.
let routes = vec![
route(
HostPattern::Strict("sub.example.com".into()),
PathKind::Exact,
"/admin-only",
),
route(
HostPattern::Wildcard {
suffix: "example.com".into(),
capture: None,
},
PathKind::Exact,
"/greet",
),
];
let r = r#match(&routes, "sub.example.com", "GET", "/greet").unwrap();
assert_eq!(r.matched.route_id, routes[1].route_id);
}
#[test]
fn method_filters_apply() {
let mut get_route = route(HostPattern::Any, PathKind::Exact, "/echo");
get_route.method = Some("GET".into());
let mut post_route = route(HostPattern::Any, PathKind::Exact, "/echo");
post_route.method = Some("POST".into());
let routes = vec![get_route.clone(), post_route.clone()];
let r1 = r#match(&routes, "host", "GET", "/echo").unwrap();
assert_eq!(r1.matched.route_id, get_route.route_id);
let r2 = r#match(&routes, "host", "POST", "/echo").unwrap();
assert_eq!(r2.matched.route_id, post_route.route_id);
assert!(r#match(&routes, "host", "DELETE", "/echo").is_none());
}
#[test]
fn host_with_port_strips_port() {
let routes = vec![route(
HostPattern::Strict("sub.example.com".into()),
PathKind::Exact,
"/x",
)];
let r = r#match(&routes, "sub.example.com:8443", "GET", "/x");
assert!(r.is_some());
}
}

View File

@@ -0,0 +1,28 @@
//! Custom routing: turning a `(host, method, path)` request into the
//! script that should answer it.
//!
//! This module is the single source of truth for how PiCloud routes
//! work. The user contract:
//!
//! * **Path kinds** — `exact`, `prefix`, `param`. Validation rules
//! in `pattern`; the orchestrator parses the raw stored strings
//! into typed `PathPattern` values for matching.
//! * **Host kinds** — `any`, `strict`, `wildcard`. Same shape.
//! * **Within-kind uniqueness** — two routes of the same kind that
//! could match the same request are a configuration error.
//! See `conflict`.
//! * **Cross-kind precedence at request time** — `exact` wins
//! absolute; among non-exact, more leading-literal segments
//! wins; tie → `param > prefix`. See `matcher`.
//! * **Host dispatch** — `strict > wildcard > any`; longest matching
//! wildcard suffix breaks ties between wildcards.
pub mod conflict;
pub mod matcher;
pub mod pattern;
pub mod table;
pub use conflict::{conflicts, ConflictReason};
pub use matcher::{MatchResult, Matched};
pub use pattern::{HostPattern, ParseError, PathPattern, PathSegment};
pub use table::RouteTable;

View File

@@ -0,0 +1,417 @@
//! Parsing + validation of host and path patterns.
//!
//! Both kinds round-trip via the storage form (`(kind, raw_string)`)
//! and the parsed form (`HostPattern` / `PathPattern`). The
//! orchestrator parses once at table load; matching and conflict-checks
//! work over the parsed form.
use picloud_shared::{HostKind, PathKind};
use thiserror::Error;
// ----------------------------------------------------------------------------
// Errors
// ----------------------------------------------------------------------------
#[derive(Debug, Clone, Error, PartialEq, Eq)]
pub enum ParseError {
#[error("path must not be empty")]
EmptyPath,
#[error("path must start with '/'")]
PathMissingLeadingSlash,
#[error("prefix path must end with '/*'")]
PrefixMissingTail,
#[error(
"param path must contain at least one ':name' segment (use kind=exact for literal paths)"
)]
ParamWithoutCaptures,
#[error("':' must start a whole path segment; bad segment: {0:?}")]
MidSegmentParam(String),
#[error("empty param name in segment {0:?}")]
EmptyParamName(String),
#[error("invalid param name {0:?}: must match [a-zA-Z_][a-zA-Z0-9_]*")]
InvalidParamName(String),
#[error("duplicate param name {0:?} within a single route")]
DuplicateParamName(String),
#[error("path '{0}' is reserved for system use")]
ReservedPath(String),
#[error("'{{name}}' syntax is reserved for a future SDK release; use ':name' for whole-segment params")]
ReservedBraceSyntax,
#[error("host must not be empty")]
EmptyHost,
#[error("wildcard host must start with '*.'")]
WildcardMissingPrefix,
#[error("wildcard suffix must not be empty (got just '*')")]
EmptyWildcardSuffix,
#[error("'{{name}}.example.com' subdomain capture is reserved for a future SDK release; use '*.example.com'")]
ReservedHostBraceSyntax,
}
// ----------------------------------------------------------------------------
// Path patterns
// ----------------------------------------------------------------------------
/// Path-side prefixes that user routes are not allowed to bind to.
/// Posting a route with a path beginning with any of these returns 422.
pub const RESERVED_PATH_PREFIXES: &[&str] = &["/api/", "/admin/", "/healthz", "/version"];
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PathPattern {
Exact(String),
/// Stored with the trailing slash (no `*`). Matches `path.starts_with(prefix)`.
Prefix(String),
Param(Vec<PathSegment>),
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PathSegment {
Literal(String),
Param(String),
}
impl PathPattern {
/// Number of leading literal segments before the first variable segment.
/// Used as the primary specificity score at match time.
#[must_use]
pub fn leading_literal_count(&self) -> usize {
match self {
Self::Exact(_) => usize::MAX, // exact wins absolute
Self::Prefix(p) => p.split('/').filter(|s| !s.is_empty()).count(),
Self::Param(segs) => segs
.iter()
.take_while(|s| matches!(s, PathSegment::Literal(_)))
.count(),
}
}
}
/// Parse a stored path. The `kind` selects validation rules; the same
/// raw string can be valid under multiple kinds (e.g., `/greet/:name`
/// parses as `Exact` with a literal `:` or as `Param` with a capture).
pub fn parse_path(kind: PathKind, raw: &str) -> Result<PathPattern, ParseError> {
if raw.is_empty() {
return Err(ParseError::EmptyPath);
}
if !raw.starts_with('/') {
return Err(ParseError::PathMissingLeadingSlash);
}
if raw.contains('{') || raw.contains('}') {
return Err(ParseError::ReservedBraceSyntax);
}
check_reserved(raw)?;
match kind {
PathKind::Exact => Ok(PathPattern::Exact(raw.to_string())),
PathKind::Prefix => parse_prefix(raw),
PathKind::Param => parse_param(raw),
}
}
fn check_reserved(raw: &str) -> Result<(), ParseError> {
for r in RESERVED_PATH_PREFIXES {
if raw == r.trim_end_matches('/') || raw.starts_with(r) {
return Err(ParseError::ReservedPath(raw.to_string()));
}
}
Ok(())
}
fn parse_prefix(raw: &str) -> Result<PathPattern, ParseError> {
let stripped = raw
.strip_suffix("/*")
.ok_or(ParseError::PrefixMissingTail)?;
// Normalize: store with trailing `/` so starts_with matches exactly
// the "more under here" semantic. `/greet/*` → "/greet/".
let prefix = if stripped.is_empty() {
"/".to_string()
} else {
format!("{stripped}/")
};
Ok(PathPattern::Prefix(prefix))
}
fn parse_param(raw: &str) -> Result<PathPattern, ParseError> {
let mut segments = Vec::new();
let mut seen_names = Vec::new();
let mut any_param = false;
// Skip the leading '/'; trailing '/' (other than the root case)
// would create an empty segment and is rejected.
for seg in raw.trim_start_matches('/').split('/') {
if seg.is_empty() {
return Err(ParseError::MidSegmentParam(raw.to_string()));
}
if let Some(rest) = seg.strip_prefix(':') {
// The chars after `:` must be the WHOLE name and nothing
// else. Distinguish "name starts with digit" (an invalid
// identifier as such) from "name has trailing non-ident
// characters" (mixing :name with literal text — what we
// call mid-segment).
if rest.is_empty() {
return Err(ParseError::EmptyParamName(seg.to_string()));
}
let bad_first =
!(rest.chars().next().unwrap().is_ascii_alphabetic() || rest.starts_with('_'));
let all_ident_chars = rest.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
if !all_ident_chars {
return Err(ParseError::MidSegmentParam(seg.to_string()));
}
if bad_first {
return Err(ParseError::InvalidParamName(rest.to_string()));
}
if seen_names.iter().any(|n: &String| n == rest) {
return Err(ParseError::DuplicateParamName(rest.to_string()));
}
seen_names.push(rest.to_string());
segments.push(PathSegment::Param(rest.to_string()));
any_param = true;
} else if seg.contains(':') {
return Err(ParseError::MidSegmentParam(seg.to_string()));
} else {
segments.push(PathSegment::Literal(seg.to_string()));
}
}
if !any_param {
return Err(ParseError::ParamWithoutCaptures);
}
Ok(PathPattern::Param(segments))
}
// ----------------------------------------------------------------------------
// Host patterns
// ----------------------------------------------------------------------------
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HostPattern {
Any,
Strict(String),
/// Suffix WITHOUT the leading `*.`; the matcher accepts any
/// non-empty label followed by `.suffix`.
Wildcard {
suffix: String,
capture: Option<String>,
},
}
impl HostPattern {
/// Specificity rank for host dispatch (higher wins).
#[must_use]
pub fn specificity(&self) -> HostSpecificity {
match self {
Self::Strict(_) => HostSpecificity::Strict,
Self::Wildcard { suffix, .. } => HostSpecificity::Wildcard(suffix.len()),
Self::Any => HostSpecificity::Any,
}
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HostSpecificity {
Any,
/// Longer wildcard suffix = more specific.
Wildcard(usize),
Strict,
}
pub fn parse_host(
kind: HostKind,
raw: &str,
capture: Option<&str>,
) -> Result<HostPattern, ParseError> {
match kind {
HostKind::Any => Ok(HostPattern::Any),
HostKind::Strict => {
if raw.is_empty() {
return Err(ParseError::EmptyHost);
}
if raw.contains('{') || raw.contains('}') {
return Err(ParseError::ReservedHostBraceSyntax);
}
Ok(HostPattern::Strict(raw.to_string()))
}
HostKind::Wildcard => {
if raw.is_empty() {
return Err(ParseError::EmptyHost);
}
if raw.contains('{') || raw.contains('}') {
return Err(ParseError::ReservedHostBraceSyntax);
}
let suffix = raw
.strip_prefix("*.")
.ok_or(ParseError::WildcardMissingPrefix)?;
if suffix.is_empty() {
return Err(ParseError::EmptyWildcardSuffix);
}
Ok(HostPattern::Wildcard {
suffix: suffix.to_string(),
capture: capture.map(str::to_string),
})
}
}
}
// ----------------------------------------------------------------------------
// Tests
// ----------------------------------------------------------------------------
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn exact_takes_any_string() {
let p = parse_path(PathKind::Exact, "/greet").unwrap();
assert_eq!(p, PathPattern::Exact("/greet".into()));
// Even one with a colon — exact stores it literally.
let p2 = parse_path(PathKind::Exact, "/greet/:name").unwrap();
assert_eq!(p2, PathPattern::Exact("/greet/:name".into()));
}
#[test]
fn prefix_normalizes_trailing_slash() {
let p = parse_path(PathKind::Prefix, "/greet/*").unwrap();
assert_eq!(p, PathPattern::Prefix("/greet/".into()));
}
#[test]
fn prefix_rejects_no_trailing_star() {
let e = parse_path(PathKind::Prefix, "/greet").unwrap_err();
assert_eq!(e, ParseError::PrefixMissingTail);
}
#[test]
fn prefix_at_root() {
let p = parse_path(PathKind::Prefix, "/*").unwrap();
assert_eq!(p, PathPattern::Prefix("/".into()));
}
#[test]
fn param_parses_segments_and_captures_names() {
let p = parse_path(PathKind::Param, "/users/:id/posts/:post").unwrap();
match p {
PathPattern::Param(segs) => {
assert_eq!(segs.len(), 4);
assert_eq!(segs[0], PathSegment::Literal("users".into()));
assert_eq!(segs[1], PathSegment::Param("id".into()));
assert_eq!(segs[2], PathSegment::Literal("posts".into()));
assert_eq!(segs[3], PathSegment::Param("post".into()));
}
_ => panic!("expected Param"),
}
}
#[test]
fn param_rejects_mid_segment_colon() {
let e = parse_path(PathKind::Param, "/greet/my:name").unwrap_err();
assert!(matches!(e, ParseError::MidSegmentParam(_)));
let e2 = parse_path(PathKind::Param, "/greet/:name.json").unwrap_err();
assert!(matches!(e2, ParseError::MidSegmentParam(_)));
}
#[test]
fn param_rejects_duplicate_names() {
let e = parse_path(PathKind::Param, "/users/:id/posts/:id").unwrap_err();
assert_eq!(e, ParseError::DuplicateParamName("id".into()));
}
#[test]
fn param_rejects_empty_or_invalid_names() {
assert!(matches!(
parse_path(PathKind::Param, "/greet/:"),
Err(ParseError::EmptyParamName(_))
));
assert!(matches!(
parse_path(PathKind::Param, "/greet/:9id"),
Err(ParseError::InvalidParamName(_))
));
}
#[test]
fn param_requires_at_least_one_capture() {
let e = parse_path(PathKind::Param, "/just/literal").unwrap_err();
assert_eq!(e, ParseError::ParamWithoutCaptures);
}
#[test]
fn rejects_brace_syntax_reserved() {
let e = parse_path(PathKind::Exact, "/x/{name}").unwrap_err();
assert_eq!(e, ParseError::ReservedBraceSyntax);
}
#[test]
fn rejects_reserved_paths() {
for raw in [
"/api/v1/admin/scripts",
"/api/v2/foo",
"/admin/dashboard",
"/healthz",
"/version",
] {
let e = parse_path(PathKind::Exact, raw).unwrap_err();
assert!(
matches!(e, ParseError::ReservedPath(_)),
"expected reserved for {raw:?}, got {e:?}"
);
}
}
#[test]
fn rejects_missing_leading_slash() {
let e = parse_path(PathKind::Exact, "greet").unwrap_err();
assert_eq!(e, ParseError::PathMissingLeadingSlash);
}
#[test]
fn host_strict_and_wildcard() {
assert_eq!(
parse_host(HostKind::Strict, "sub.example.com", None).unwrap(),
HostPattern::Strict("sub.example.com".into())
);
assert_eq!(
parse_host(HostKind::Wildcard, "*.example.com", None).unwrap(),
HostPattern::Wildcard {
suffix: "example.com".into(),
capture: None
}
);
assert_eq!(
parse_host(HostKind::Any, "", None).unwrap(),
HostPattern::Any
);
}
#[test]
fn host_rejects_wildcard_without_star_dot() {
let e = parse_host(HostKind::Wildcard, "example.com", None).unwrap_err();
assert_eq!(e, ParseError::WildcardMissingPrefix);
}
#[test]
fn host_rejects_brace_syntax() {
let e = parse_host(HostKind::Wildcard, "{tenant}.example.com", None).unwrap_err();
assert_eq!(e, ParseError::ReservedHostBraceSyntax);
}
#[test]
fn leading_literal_count_works() {
let exact = parse_path(PathKind::Exact, "/foo/users").unwrap();
assert_eq!(exact.leading_literal_count(), usize::MAX);
let prefix2 = parse_path(PathKind::Prefix, "/foo/users/*").unwrap();
assert_eq!(prefix2.leading_literal_count(), 2);
let prefix1 = parse_path(PathKind::Prefix, "/foo/*").unwrap();
assert_eq!(prefix1.leading_literal_count(), 1);
let param2 = parse_path(PathKind::Param, "/foo/users/:id").unwrap();
assert_eq!(param2.leading_literal_count(), 2);
let param1 = parse_path(PathKind::Param, "/foo/:section").unwrap();
assert_eq!(param1.leading_literal_count(), 1);
let param_then_lit = parse_path(PathKind::Param, "/foo/:s/list").unwrap();
// Literal AFTER a param doesn't count for "leading literal".
assert_eq!(param_then_lit.leading_literal_count(), 1);
}
}

View File

@@ -0,0 +1,43 @@
//! In-memory snapshot of compiled routes, shared by manager (writes)
//! and orchestrator (reads).
//!
//! Holds an `arc-swap`-style lock-free hand-off so the dispatcher can
//! read without contending against the writer; in MVP-single-process
//! we just use `RwLock` and accept the cheap contention.
use std::sync::RwLock;
use super::matcher::{r#match, CompiledRoute, MatchResult};
#[derive(Default)]
pub struct RouteTable {
inner: RwLock<Vec<CompiledRoute>>,
}
impl RouteTable {
#[must_use]
pub fn new() -> Self {
Self::default()
}
/// Replace the whole table atomically. The manager calls this after
/// each successful route CRUD operation (by re-reading from DB).
pub fn replace(&self, routes: Vec<CompiledRoute>) {
let mut guard = self.inner.write().expect("route table poisoned");
*guard = routes;
}
/// Dispatch a request to a matching route, or `None`.
#[must_use]
pub fn match_request(&self, host: &str, method: &str, path: &str) -> Option<MatchResult> {
let guard = self.inner.read().expect("route table poisoned");
r#match(guard.iter(), host, method, path)
}
/// Returns a clone of the currently compiled routes; intended for
/// the dashboard's "list routes" admin endpoint.
#[must_use]
pub fn snapshot(&self) -> Vec<CompiledRoute> {
self.inner.read().expect("route table poisoned").clone()
}
}