PiCloud/serverless_cloud_blueprint.md
MechaCat02 b8b544816d chore: initial scaffold — workspace, docs, blueprint
Sets up the PiCloud monorepo as a Cargo workspace organised around the
three-service architecture (manager / orchestrator / executor), each
backed by a *-core library crate so the same logic powers both the MVP
all-in-one `picloud` binary and the future split-process cluster mode.

  * crates/shared, executor-core, orchestrator-core, manager-core
    define the library surface and trait seams between the three
    services (`ExecutorClient`, `ScriptResolver`, `ScriptRepository`).
  * crates/picloud is the MVP entrypoint; serves /healthz on 8080
    (override via PICLOUD_BIND).
  * crates/picloud-{manager,orchestrator,executor} are skeleton
    binaries that keep the crate boundaries honest until cluster
    mode is built out in v1.3+.
  * docs/git-workflow.md defines the trunk-based workflow:
    short-lived branches, Conventional Commits, separate hotfix
    flow with mandatory reproduction tests.
  * CLAUDE.md captures the working rules for future Claude sessions.

Workspace passes `cargo fmt`, `cargo clippy -D warnings` (with
pedantic enabled), and `cargo test --workspace`. The all-in-one
binary responds on `/healthz` and `/`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 23:16:32 +02:00


# Project Blueprint: Lightweight Event-Based Serverless Cloud
**Status**: Phase 4 — Blueprint Complete
**Last Updated**: 2026-04-10
**Audience**: Solo developer (DIY self-hosted)
---
## 1. Project Overview
### Vision
A lightweight, self-hosted, event-driven compute platform that allows developers to deploy and trigger Rhai scripts via HTTP endpoints. Scripts run in isolated containers, scale to zero when idle, and return structured responses. Optimized for resource efficiency on consumer hardware (< 100 functions).
### Core Value Proposition
- **Simple deployment**: Upload a Rhai script, get an HTTP endpoint
- **Minimal overhead**: Executor containers spawn on demand; no per-script processes run while idle
- **DIY-friendly**: Run on modest hardware (single server, RPi-adjacent)
- **Extensible**: Pluggable storage, compute, and messaging later
### MVP Scope
**In Scope:**
- Dashboard: script upload + metadata (name, description, version, config)
- REST API: script CRUD operations
- HTTP-triggered script execution
- Request → Rhai script → JSON response
- PostgreSQL for script storage
- Docker for isolated execution
- Execution logs and basic observability
**Out of Scope (v1.1+):**
- Queue-based triggers
- Scheduled jobs (cron)
- Multi-user/projects
- External HTTP calls from scripts
- Metrics dashboards
- Secrets management
- Script versioning/rollback
### Success Criteria
1. Deploy a Rhai script in < 1 minute
2. Script responds to HTTP requests within 500ms (p95)
3. Runs on single modest server (2GB RAM, dual-core CPU)
4. No background services consume CPU when idle
---
## 2. Architecture Overview
### High-Level System Diagram
```
┌─────────────────────────────────────────────────────────────────┐
│                       Self-Hosted Server                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────────┐         ┌──────────────────────┐      │
│  │    Web Dashboard     │         │   Orchestrator API   │      │
│  │   (Alpine.js SPA)    │         │    (Rust + Axum)     │      │
│  │      Port 3000       │         │      Port 8080       │      │
│  └──────────┬───────────┘         └──────────┬───────────┘      │
│             │                                │                  │
│             │ Upload script                  │ HTTP requests    │
│             │ Manage scripts                 │ Script metadata  │
│             │                                │                  │
│             └───────────────┬────────────────┘                  │
│                             │                                   │
│                     ┌───────▼────────┐                          │
│                     │   PostgreSQL   │                          │
│                     │ (scripts, MD)  │                          │
│                     └───────┬────────┘                          │
│                             │                                   │
│            ┌────────────────┼────────────────┐                  │
│            │                │                │                  │
│       ┌────▼─────┐     ┌────▼─────┐     ┌────▼─────┐            │
│       │Container │     │Container │     │Container │            │
│       │ Instance │     │ Instance │     │ Instance │ (on-demand)│
│       │(Rhai Ex.)│     │(Rhai Ex.)│     │(Rhai Ex.)│            │
│       └────┬─────┘     └────┬─────┘     └────┬─────┘            │
│            │                │                │                  │
│            └────────────────┼────────────────┘                  │
│                             │                                   │
│                    ┌────────▼─────────┐                         │
│                    │  Docker Daemon   │                         │
│                    │ (container mgmt) │                         │
│                    └──────────────────┘                         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
### Data Flow: HTTP Request → Response
1. **HTTP Request** arrives at Orchestrator (`POST /api/execute/{script_id}`)
2. **Orchestrator** fetches script from PostgreSQL
3. **Docker daemon** spawns container from pre-built executor image
4. **Container startup** loads script into Rhai runtime + passes request context
5. **Rhai script** executes, processes request, returns JSON object
6. **Orchestrator** extracts `statusCode`, `headers`, `body` from response
7. **HTTP Response** sent to client
8. **Container** is destroyed (scale to zero)
---
## 3. Core Components
### 3.1 Orchestrator Service
**Language**: Rust
**Framework**: Axum
**Port**: 8080 (default)
**Responsibilities:**
- HTTP server (REST API for script management + trigger)
- Script lifecycle: fetch, validate, store
- Container orchestration: spawn, monitor, cleanup
- Request/response marshalling
- Error handling & logging
**Key Endpoints (MVP):**
- `POST /api/scripts` — upload script
- `GET /api/scripts` — list all scripts
- `DELETE /api/scripts/{id}` — delete script
- `POST /api/execute/{script_id}` — trigger script execution (with request body/headers)
**Internal Tasks:**
- Periodically clean up orphaned containers (optional, for MVP just GC on startup)
- Log execution events to stdout/logs
---
### 3.2 Executor Container Image
**Base**: `alpine:latest`
**Contents**:
- Rhai runtime (compiled binary or via package manager)
- Minimal libc (musl on Alpine)
- Script loader + executor wrapper
- Logging utilities
**Startup Flow:**
```bash
# Entrypoint sketch. Assumes the script source arrives in the SCRIPT_CONTENT
# env var and the request JSON on stdin (the design allows either channel).
set -eu
SCRIPT_PATH=/tmp/script.rhai
printf '%s' "$SCRIPT_CONTENT" > "$SCRIPT_PATH"
REQUEST_JSON=$(cat)
rhai_executor --script "$SCRIPT_PATH" --request "$REQUEST_JSON"
```
**Output**: JSON response to stdout, captured by Orchestrator
---
### 3.3 Dashboard (Web UI)
**Framework**: Alpine.js (MVP), Svelte (v1.0+)
**Port**: 3000 (default)
**Features (MVP):**
- Script upload form (file picker or textarea)
- Script metadata input (name, description, version, config)
- Config fields: timeout (s), memory limit (MB), enabled service access (DB/S3/queue/functions)
- List of deployed scripts
- Simple "Deploy" / "Delete" actions
**Technology Stack:**
- HTML + CSS + Alpine.js
- Fetch API to call Orchestrator
- No build step (initially), just serve static files
---
### 3.4 PostgreSQL Database
**Schema (MVP):**
```sql
CREATE TABLE scripts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
description TEXT,
version INT DEFAULT 1,
script_content TEXT NOT NULL,
-- Config
timeout_seconds INT DEFAULT 30,
memory_limit_mb INT DEFAULT 256,
-- Service access (MVP: unused, future)
access_db BOOLEAN DEFAULT false,
access_s3 BOOLEAN DEFAULT false,
access_queue BOOLEAN DEFAULT false,
access_functions BOOLEAN DEFAULT false,
-- Metadata
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
-- Execution tracking (MVP: optional)
last_executed_at TIMESTAMP,
execution_count INT DEFAULT 0
);
CREATE TABLE execution_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
script_id UUID REFERENCES scripts(id) ON DELETE CASCADE,
request_path TEXT,
request_headers JSONB,
request_body JSONB,
response_code INT,
response_body JSONB,
logs TEXT,
duration_ms INT,
status TEXT, -- 'success', 'timeout', 'error', etc.
created_at TIMESTAMP DEFAULT NOW()
);
```
**Rationale:**
- Simple, relational structure
- `execution_logs` for audit trail + debugging (can be pruned later)
- JSONB for flexible config/response storage
---
## 4. Data Model
### Script Entity
```json
{
"id": "uuid",
"name": "Process Payment",
"description": "Webhook handler for payment processor",
"version": 1,
"script_content": "let amt = ctx.request.body.amount;\n#{ statusCode: 200, body: #{ processed: amt } }",
"timeout_seconds": 30,
"memory_limit_mb": 256,
"access_db": false,
"access_s3": false,
"access_queue": false,
"access_functions": false,
"interceptors": {
"s3": { "before_write": false },
"documents": { "before_create": false },
"queue": { "before_send": false }
},
"created_at": "2026-04-10T12:00:00Z",
"updated_at": "2026-04-10T12:00:00Z",
"last_executed_at": "2026-04-10T12:05:00Z",
"execution_count": 42
}
```
### Execution Log Entity
```json
{
"id": "uuid",
"script_id": "uuid",
"request_path": "/api/execute/script-123",
"request_headers": { "content-type": "application/json" },
"request_body": { "amount": 100 },
"response_code": 200,
"response_body": { "processed": 100 },
"logs": "[12:05:10] Script started\n[12:05:11] Processing...",
"duration_ms": 145,
"status": "success",
"created_at": "2026-04-10T12:05:11Z"
}
```
---
## 5. API Specification (MVP)
### 5.1 Upload Script
```
POST /api/scripts
Content-Type: application/json
{
"name": "string",
"description": "string",
"script_content": "string",
"timeout_seconds": 30,
"memory_limit_mb": 256
}
Response: 201 Created
{
"id": "uuid",
"name": "...",
...
}
```
### 5.2 List Scripts
```
GET /api/scripts
Response: 200 OK
[
{ id: "...", name: "...", ... },
{ id: "...", name: "...", ... }
]
```
### 5.3 Delete Script
```
DELETE /api/scripts/{script_id}
Response: 204 No Content
```
### 5.4 Execute Script (via HTTP Endpoint)
```
POST /api/execute/{script_id}
Content-Type: application/json
[any headers]
[any request body]
Response: [script-returned status code]
{
"..." : "..."
}
```
**Notes:**
- Script receives full HTTP request (path, headers, body)
- Response is script's JSON object (assumes `{ statusCode, headers, body }`)
- On error (timeout, crash): `{ statusCode: 500, body: "Server error" }`
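The notes above imply a small mapping step inside the Orchestrator: a returned object is passed through, and a timeout or crash collapses into a generic 500. The following is a minimal sketch of that step, not the project's actual code. The type and function names (`ScriptOutcome`, `HttpResponse`, `to_http_response`) are hypothetical, and the body is kept as JSON text so the example needs no JSON crate.

```rust
use std::collections::BTreeMap;

/// What the orchestrator observed after running a script (hypothetical type).
#[derive(Debug)]
pub enum ScriptOutcome {
    /// The script returned a `{ statusCode, headers, body }` object.
    Returned {
        status_code: u16,
        headers: BTreeMap<String, String>,
        body: String, // JSON text
    },
    /// The script exceeded its configured timeout.
    TimedOut,
    /// The script or its container crashed.
    Crashed,
}

/// The response sent back to the HTTP client (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct HttpResponse {
    pub status_code: u16,
    pub headers: BTreeMap<String, String>,
    pub body: String,
}

/// Pass a script's response through; map timeouts and crashes to a 500.
pub fn to_http_response(outcome: ScriptOutcome) -> HttpResponse {
    match outcome {
        ScriptOutcome::Returned { status_code, headers, body } => HttpResponse {
            status_code,
            headers,
            body,
        },
        ScriptOutcome::TimedOut | ScriptOutcome::Crashed => HttpResponse {
            status_code: 500,
            headers: BTreeMap::new(),
            body: "\"Server error\"".to_string(),
        },
    }
}
```

Keeping the failure cases in one match arm makes it harder to leak internal error details to the client; the specifics can still be written to `execution_logs`.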
---
## 6. Rhai SDK (MVP Stub)
For MVP, scripts have access to:
### Core Request/Response
- **ctx object**: Contains execution metadata + request data (see below)
- **Return value**: `{ statusCode: int, headers: object, body: object }`
### Context Object (Available Globally)
```rhai
// Execution metadata
ctx.execution_id // UUID of this execution
ctx.script_id // UUID of the script being run
ctx.script_name // Name of the script
ctx.request_id // Request ID for tracing
ctx.trace_id // For call graphs (v1.2+)
ctx.invocation_type // 'http', 'function', 'scheduled', etc.
ctx.parent_execution_id // For function hierarchies (v1.2+)
// Request context
ctx.request.path // HTTP path
ctx.request.headers // HTTP headers object
ctx.request.body // Request body (parsed JSON or raw)
```
### Structured Logging (v1.0+)
```rhai
log.info("Processing order", #{ order_id: 123, user: "alice" });
log.warn("Rate limit approaching", #{ remaining: 10 });
log.error("Payment failed", #{ error: "timeout", retry_count: 2 });
log.debug("Internal state", #{ state: #{ /* ... */ } });
```
**Output**: Captured in execution logs, searchable in dashboard
### Error Handling & Retry (v1.1+)
```rhai
// Retry a function with exponential backoff
let result = retry::call(
    || { invoke("process-data", #{ item: 123 }) },
    #{
        max_attempts: 3,
        backoff: "exponential", // or "linear"
        initial_delay_ms: 100,
        max_delay_ms: 5000
    }
);
// Retry an HTTP call
let response = retry::http_call(
    || { http.post("https://api.example.com/webhook", body) },
    #{
        max_attempts: 5,
        backoff: "exponential",
        on_retry: |attempt, error| {
            log.warn("Retry attempt", #{ attempt: attempt, error: error });
        }
    }
);
// Manual error handling
try {
    let data = invoke("might-fail", #{});
} catch (err) {
    log.error("Invocation failed", #{ error: err });
    return #{ statusCode: 500, body: #{ error: "Service unavailable" } };
}
```
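On the host side, the `backoff`, `initial_delay_ms`, and `max_delay_ms` options have to be turned into a concrete wait time per retry. The blueprint does not define the formula, so the sketch below assumes doubling for `"exponential"` and a fixed increment for `"linear"`, both capped at `max_delay_ms`. The names `Backoff` and `retry_delay_ms` are hypothetical.

```rust
/// Backoff strategies named in the retry options (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Backoff {
    Linear,
    Exponential,
}

/// Delay in milliseconds before retry number `retry` (0 = first retry).
/// Assumed formulas: linear = initial * (retry + 1), exponential = initial * 2^retry.
pub fn retry_delay_ms(
    backoff: Backoff,
    initial_delay_ms: u64,
    max_delay_ms: u64,
    retry: u32,
) -> u64 {
    let raw = match backoff {
        Backoff::Linear => initial_delay_ms.saturating_mul(u64::from(retry) + 1),
        Backoff::Exponential => {
            // 2^retry, saturating instead of overflowing for large retry counts.
            let factor = 1u64.checked_shl(retry).unwrap_or(u64::MAX);
            initial_delay_ms.saturating_mul(factor)
        }
    };
    // Never wait longer than the configured ceiling.
    raw.min(max_delay_ms)
}
```

With `initial_delay_ms: 100` and `max_delay_ms: 5000`, the exponential schedule under these assumptions is 100, 200, 400, 800, 1600, 3200, and then 5000 for every later retry.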
---
## 6.1 Future: Document Schema Validation (v1.2+)
For documents, allow optional **schema definitions** similar to MongoDB:
```rhai
// Define schema when creating
docs.create("users",
    #{ name: "Alice", email: "alice@example.com" },
    #{
        schema: #{
            name: "string",
            email: "string",
            age: "number?", // optional
            tags: "array"
        }
    }
);
// Validate before update
docs.update("users", user_id,
    #{ age: 31 },
    #{ schema: #{ age: "number" } }
);
```
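The type strings in the schema (`"string"`, `"number?"`, `"array"`) need a small parser on the host side. The sketch below shows one possible reading, in which a trailing `?` marks a field as optional and every other field is required. The source does not state that rule, and the names `Kind`, `FieldSpec`, `parse_field_spec`, and `field_is_valid` are hypothetical.

```rust
/// Value kinds that appear in the schema example (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    String,
    Number,
    Array,
}

/// A parsed field spec such as "string" or "number?".
#[derive(Debug, PartialEq)]
pub struct FieldSpec {
    pub kind: Kind,
    pub optional: bool,
}

/// Parse a type string; returns None for an unknown type name.
pub fn parse_field_spec(spec: &str) -> Option<FieldSpec> {
    // A trailing '?' is assumed to mean "optional".
    let (name, optional) = match spec.strip_suffix('?') {
        Some(base) => (base, true),
        None => (spec, false),
    };
    let kind = match name {
        "string" => Kind::String,
        "number" => Kind::Number,
        "array" => Kind::Array,
        _ => return None,
    };
    Some(FieldSpec { kind, optional })
}

/// Check one field: `value` is the kind found in the document, or None if absent.
pub fn field_is_valid(spec: &FieldSpec, value: Option<Kind>) -> bool {
    match value {
        Some(kind) => kind == spec.kind,
        None => spec.optional,
    }
}
```

Under this strict reading, a document that omits a non-optional field such as `tags: "array"` would be rejected, so the final design needs to say whether absent fields are allowed by default.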
---
## 6.2 Example Script: Full SDK Usage
```rhai
// Get execution and request context
let user_id = ctx.request.body.user_id;
// Log start
log.info("Processing request", #{
    script: ctx.script_name,
    execution_id: ctx.execution_id
});
// Call another function with retry
let user_data = retry::call(
    || { invoke("fetch-user", #{ id: user_id }) },
    #{ max_attempts: 2, backoff: "linear" }
);
if user_data.statusCode != 200 {
    log.error("Failed to fetch user", #{ response: user_data });
    return #{ statusCode: 500, body: #{ error: "User fetch failed" } };
}
// Store in KV cache
kv.set("user-cache", `user:${user_id}`, user_data.body, 3600);
// Store in documents
let doc = docs.create("user-requests", #{
    user_id: user_id,
    request_at: "2026-04-10T12:00:00Z",
    status: "processed"
});
// Log completion
log.info("Request processed", #{
    doc_id: doc,
    user_id: user_id
});
return #{
    statusCode: 200,
    headers: #{ "Content-Type": "application/json" },
    body: #{ user: user_data.body, cached: true }
};
```
### 8.4 User Management Service
**Purpose**: Built-in user authentication, management, and invitations with secure password handling.
**PostgreSQL Schema:**
```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto; -- For password hashing
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email TEXT NOT NULL UNIQUE,
password_hash TEXT NOT NULL,
password_salt TEXT NOT NULL,
-- Profile
name TEXT,
locked BOOLEAN DEFAULT false,
-- Roles & Permissions
roles TEXT[] DEFAULT '{}', -- e.g., ["admin", "moderator"]
permissions JSONB DEFAULT '{}', -- Custom permissions structure
-- Metadata
metadata JSONB DEFAULT '{}', -- Custom user data (profile pic URL, preferences, etc.)
-- Audit
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
last_login_at TIMESTAMP,
last_password_change_at TIMESTAMP
);
-- Invitations & password reset tokens
CREATE TABLE user_tokens (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
token_type TEXT NOT NULL, -- 'invite', 'password_reset', 'login_link'
token_hash TEXT NOT NULL UNIQUE,
expires_at TIMESTAMP NOT NULL,
used_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_user_tokens_user_id ON user_tokens(user_id);
CREATE INDEX idx_user_tokens_type ON user_tokens(token_type);
```
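The `expires_at` and `used_at` columns of `user_tokens` suggest a simple acceptance rule for invite, reset, and login-link tokens. The sketch below assumes tokens are single-use and expire exactly at `expires_at`; the source implies this but does not say so. The `UserToken` struct and `token_is_usable` function are hypothetical, and timestamps are Unix seconds to keep the example std-only.

```rust
/// Subset of a `user_tokens` row (hypothetical type; timestamps in Unix seconds).
pub struct UserToken {
    pub expires_at: u64,
    pub used_at: Option<u64>,
}

/// A token is accepted only if it has never been used and has not expired.
pub fn token_is_usable(token: &UserToken, now: u64) -> bool {
    token.used_at.is_none() && now < token.expires_at
}
```

Setting `used_at` in the same transaction that applies the password change or login would keep a token from being redeemed twice.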
**Rhai SDK (v1.1+):**
```rhai
// ===== CREATE & INVITE =====
// Create user with password
let user_id = users.create(#{
    email: "alice@example.com",
    password: "secure-password",
    name: "Alice Smith",
    roles: ["user"],
    metadata: #{ profile_pic: "https://..." }
});
// Send invite link (creates token, sends email)
users.send_invite(email); // returns #{ token_sent: true, expires_in_days: 7 }
// Set password from invite/reset token
users.set_password_from_token(token, new_password); // returns #{ user_id: ..., success: true }
// ===== AUTHENTICATION =====
// Authenticate user
let user = users.authenticate(email, password);
if user != () {
    let user_id = user.id;
    let roles = user.roles;
} else {
    // Authentication failed
}
// Send password reset link
users.send_password_reset(email); // returns #{ sent: true, expires_in_hours: 24 }
// Send login link (passwordless)
users.send_login_link(email); // returns #{ sent: true, expires_in_minutes: 15 }
// Verify login link token
let user = users.verify_login_token(token);
// ===== READ & SEARCH =====
// Get user by ID
let user = users.get(user_id);
// Find user by email
let user = users.find_by_email("alice@example.com");
// Search users
let results = users.search(#{
    query: "alice", // Searches email, name
    limit: 50,
    offset: 0
});
// List users with filtering
let users_list = users.list(#{
    roles: ["admin"], // Filter by roles
    locked: false, // Include/exclude locked users
    limit: 100,
    offset: 0
});
// ===== UPDATE =====
// Update user data (except password)
users.update(user_id, #{
    name: "Alice Johnson",
    roles: ["user", "moderator"],
    metadata: #{ theme: "dark", notifications: true }
});
// Update password (requires old password or token)
users.update_password(user_id, old_password, new_password);
// returns #{ success: true } or #{ error: "Wrong password" }
// ===== LOCK & DELETE =====
// Lock user (disable login)
users.lock(user_id); // returns #{ success: true }
// Unlock user
users.unlock(user_id); // returns #{ success: true }
// Delete user
users.delete(user_id); // returns #{ success: true }
// ===== PERMISSIONS & ROLES =====
// Check if user has role
if users.has_role(user_id, "admin") {
// Allow admin action
}
// Check if user has permission
if users.has_permission(user_id, "posts:delete") {
// Allow deletion
}
// Grant role to user
users.add_role(user_id, "moderator");
// Revoke role
users.remove_role(user_id, "moderator");
// Set custom permissions
users.set_permissions(user_id, #{
    "posts:create": true,
    "posts:delete": false,
    "comments:moderate": true
});
```
**User Object (returned from get/auth/find):**
```json
{
"id": "uuid",
"email": "alice@example.com",
"name": "Alice Smith",
"roles": ["user", "moderator"],
"permissions": { "posts:create": true },
"metadata": { "theme": "dark" },
"locked": false,
"created_at": "2026-04-10T12:00:00Z",
"updated_at": "2026-04-10T12:05:00Z",
"last_login_at": "2026-04-10T11:55:00Z"
}
```
**Use Cases:**
- User registration with email verification
- Login flows (password or passwordless)
- Password reset flows
- Role-based access control (RBAC)
- User search/directory
- Account management (lock, delete)
---
## 10. Technology Stack
| Layer | Technology | Rationale |
|-------|-----------|-----------|
| **Orchestrator** | Rust + Axum | Performance, safety, async-first; minimal overhead |
| **Dashboard** | Alpine.js + vanilla HTML/CSS | Zero dependencies, simple to deploy, fast enough for MVP |
| **Database** | PostgreSQL + hstore | Robust ACID database; hstore extension for lightweight KV (v1.1) |
| **Container Runtime** | Docker (Docker daemon) | Industry standard, simple CLI |
| **Executor Image** | Alpine Linux + Rhai | Minimal image size (~50-100MB), fast startup |
| **Scripting** | Rhai | Lightweight, embedded-friendly, safe by default |
| **Deployment** | Docker Compose (local) / systemd (production) | Simple multi-service orchestration |
---
## 11. Deployment Model (MVP)
### Local Development
```bash
# Clone repo
git clone <repo> serverless-cloud
cd serverless-cloud
# Start all services (Orchestrator + Dashboard + Postgres)
docker-compose up
# Dashboard: http://localhost:3000
# Orchestrator: http://localhost:8080
```
### Production (Single Server)
```bash
# On target machine:
# 1. Install Docker, Docker Compose
# 2. Deploy docker-compose.yml
# 3. Optionally: use systemd service to auto-restart on reboot
docker-compose -f docker-compose.prod.yml up -d
```
### docker-compose.yml (MVP Template)
```yaml
version: '3.8'
services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: serverless
      POSTGRES_USER: app
      POSTGRES_PASSWORD: changeme
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
  orchestrator:
    build: ./orchestrator
    environment:
      DATABASE_URL: postgres://app:changeme@postgres:5432/serverless
      DOCKER_HOST: unix:///var/run/docker.sock
    ports:
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  dashboard:
    image: nginx:alpine
    volumes:
      - ./dashboard/dist:/usr/share/nginx/html
    ports:
      - "3000:80"
volumes:
  postgres_data:
```
---
## 12. Development Roadmap
### Phase 1: MVP ✓ (Current)
- [x] Orchestrator: REST API for script CRUD + execute
- [x] Executor image: load + run Rhai script
- [x] Dashboard: upload script, deploy, delete
- [x] PostgreSQL: script storage + execution logs
- [ ] **Timeline**: 4-6 weeks
**Deliverables:**
- Docker image for executor
- Rust binary (Orchestrator)
- Static HTML + Alpine.js dashboard
- docker-compose.yml for local/prod deployment
---
### Phase 2: v1.0 (Polish & Usability)
- Script versioning + rollback
- Execution history dashboard (view logs, timings, errors)
- Better error messages (script parse errors, timeouts)
- Timeout/resource limit enforcement
- Container cleanup/GC
- Rhai SDK: `ctx` request context fully documented
**Timeline**: 2-3 weeks
---
### Phase 3: v1.1 (Expand Capabilities & Services)
- Queue-based triggers (RabbitMQ / Redis)
- Scheduled jobs (cron syntax)
- Secrets management (encrypted env vars)
- **Rhai SDK: KV Store** (`kv.get()`, `kv.set()`, `kv.delete()` with collections)
- **Rhai SDK: Document Store** (`docs.create()`, `docs.find()`, `docs.update()`, `docs.delete()` with schema validation)
- **Rhai SDK: User Management** (auth, CRUD, roles, permissions, invitations, password reset)
- **Rhai SDK: Email** (`email.send(to, subject, body)` via SMTP)
- Rhai SDK: `s3.*`, `queue.*`, `invoke()`, `retry::call()`, `retry::http_call()`
- External HTTP calls from scripts (`http.get()`, `http.post()`)
- Script versioning with automatic rollback on error
**Timeline**: 8-10 weeks
---
### Phase 4: v1.2 (Advanced Workflows & Hierarchies)
- Function workflows (DAG execution, conditional branching, error handling)
- Function hierarchy (parent/child invocation, sync/async calls)
- Nested workflows
- Call graph visualization + execution tracing
- Advanced query support for document store (`docs.query()` with filters)
**Timeline**: 6-8 weeks
---
### Phase 5: v1.3+ (Scaling, Security, Observability)
- Multi-user / project namespacing
- Rate limiting on endpoints
- Auth (API keys, dashboard login)
- Metrics + monitoring dashboard
- Container pooling / warm starts
- Distributed tracing (OpenTelemetry)
- Webhooks for execution events
- S3 integration (object storage reads/writes)
---
## 7. Complete Rhai SDK Reference (MVP → v1.1+)
### Storage & Data
| Component | Methods | Availability |
|-----------|---------|--------------|
| **KV Store** | `kv.get(collection, key)`, `kv.set(collection, key, value, ttl?)`, `kv.delete(collection, key)`, `kv.has(collection, key)` | v1.1 |
| **Documents** | `docs.create(collection, data, schema?)`, `docs.find(collection, id)`, `docs.update(collection, id, data, schema?)`, `docs.delete(collection, id)`, `docs.list(collection, opts?)`, `docs.query(collection, filter?)` | v1.1 |
| **S3** | `s3.get(key)`, `s3.put(key, data)`, `s3.delete(key)`, `s3.list(prefix?)` | v1.1 |
| **Users** | `users.create(data)`, `users.get(id)`, `users.find_by_email(email)`, `users.search(query, limit, offset)`, `users.list(filters)`, `users.update(id, data)`, `users.authenticate(email, password)`, `users.update_password(id, old, new)`, `users.lock/unlock(id)`, `users.delete(id)`, `users.send_invite(email)`, `users.send_password_reset(email)`, `users.send_login_link(email)`, `users.has_role/permission(id, role/perm)`, `users.add/remove_role(id, role)` | v1.1 |
### Communication
| Component | Methods | Availability |
|-----------|---------|--------------|
| **Email** | `email.send(to, subject, body)`, `email.send_html(to, subject, html, text?)` | v1.1 |
| **HTTP** | `http.get(url, opts?)`, `http.post(url, body, opts?)`, `http.put(...)`, `http.delete(...)` | v1.1 |
### Functions & Execution
| Component | Methods | Availability |
|-----------|---------|--------------|
| **Invoke** | `invoke(function_id, args, opts?)`, `invoke_async(function_id, args)` | v1.1 |
| **Queue** | `queue.send(queue_name, message)`, `queue.send_batch(queue_name, messages)` | v1.1 |
| **Retry** | `retry::call(fn, opts)`, `retry::http_call(fn, opts)` | v1.1 |
### Observability & Context
| Component | Methods | Availability |
|-----------|---------|--------------|
| **Logging** | `log.info(msg, data?)`, `log.warn(msg, data?)`, `log.error(msg, data?)`, `log.debug(msg, data?)` | v1.0 |
| **Context** | `context().execution_id()`, `context().script_id()`, `context().request_id()`, `context().trace_id()`, `context().invocation_type()`, `context().parent_execution_id()` | v1.0+ |
### Request/Response & Context
| Component | Structure | Availability |
|-----------|-----------|--------------|
| **ctx** (global) | `ctx.execution_id`, `ctx.script_id`, `ctx.script_name`, `ctx.request_id`, `ctx.trace_id`, `ctx.invocation_type`, `ctx.parent_execution_id`, `ctx.request.path`, `ctx.request.headers`, `ctx.request.body` | MVP+ |
| **Response** | Return `{ statusCode, headers?, body }` | MVP |
---
## 8. Services (v1.1+)
### 8.1 KV Store Service
**Purpose**: Simple key-value persistence organized by collections, shared across script invocations and scripts.
**PostgreSQL Setup:**
```sql
-- Enable hstore extension (one-time setup)
CREATE EXTENSION IF NOT EXISTS hstore;
-- Create KV table with collection support
CREATE TABLE kv_store (
collection TEXT NOT NULL,
key TEXT NOT NULL,
value hstore NOT NULL,
expires_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (collection, key)
);
CREATE INDEX idx_kv_collection ON kv_store(collection);
CREATE INDEX idx_kv_expires ON kv_store(expires_at)
WHERE expires_at IS NOT NULL;
```
**Why hstore + collections?**
- Lightweight, purpose-built for key-value storage
- Collections allow logical grouping (e.g., `kv:sessions`, `kv:counters`, `kv:flags`)
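- Can be lighter than JSONB for flat string-to-string maps (worth benchmarking). Note that hstore stores only text keys and text values, so nested objects and typed scalars such as `42` or `true` must be serialized first, or the column switched to JSONB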
- Built-in indexing support
- Keeps all data in one database (no Redis dependency)
**Rhai SDK:**
```rhai
// Get a value from a collection
let val = kv.get("sessions", "user:123"); // Returns object or () if missing
// Set a value in a collection
kv.set("sessions", "user:123", #{ token: "abc", created: "2026-04-10" });
// Delete a key from a collection
kv.delete("sessions", "user:123");
// Set with TTL (seconds)
kv.set("sessions", "user:123", #{ token: "xyz" }, 3600);
// Check if key exists in a collection
if kv.has("sessions", "user:123") { ... }
// Use different collections for different purposes
kv.set("counters", "api:calls", 42);
kv.set("flags", "feature:beta", true);
kv.set("cache", "page:home", #{ html: "..." });
```
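The semantics of collections plus TTL can be illustrated with an in-memory stand-in for the `kv_store` table. This is a sketch under assumptions, not the planned implementation: values are plain strings, time is passed in as Unix seconds so the behaviour is deterministic, and an entry counts as expired once `now` reaches `expires_at`. The `KvStore` type and its methods are hypothetical.

```rust
use std::collections::HashMap;

/// One stored value with an optional expiry (Unix seconds).
struct Entry {
    value: String,
    expires_at: Option<u64>,
}

/// In-memory stand-in for `kv_store`, keyed by (collection, key).
#[derive(Default)]
pub struct KvStore {
    entries: HashMap<(String, String), Entry>,
}

impl KvStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert or overwrite a value; `ttl_secs` of None means no expiry.
    pub fn set(&mut self, collection: &str, key: &str, value: &str, ttl_secs: Option<u64>, now: u64) {
        let expires_at = ttl_secs.map(|ttl| now.saturating_add(ttl));
        self.entries.insert(
            (collection.to_string(), key.to_string()),
            Entry { value: value.to_string(), expires_at },
        );
    }

    /// Return the value unless it is missing or already expired.
    pub fn get(&self, collection: &str, key: &str, now: u64) -> Option<&str> {
        let entry = self.entries.get(&(collection.to_string(), key.to_string()))?;
        match entry.expires_at {
            Some(deadline) if now >= deadline => None,
            _ => Some(entry.value.as_str()),
        }
    }

    pub fn has(&self, collection: &str, key: &str, now: u64) -> bool {
        self.get(collection, key, now).is_some()
    }

    /// Remove a key; returns true if something was stored under it.
    pub fn delete(&mut self, collection: &str, key: &str) -> bool {
        self.entries
            .remove(&(collection.to_string(), key.to_string()))
            .is_some()
    }
}
```

Because the composite key mirrors `PRIMARY KEY (collection, key)`, the same key name can live in several collections without clashing. Expired rows are only hidden on read here; the real table would still need a periodic delete using the `idx_kv_expires` index.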
**Use Cases:**
- Cache frequently accessed data
- Store user session state
- Counters, flags, feature toggles
- Rate limiting state (hit counts)
---
### 8.2 Document Store Service
**Purpose**: Flexible NoSQL-like storage for complex JSON documents, organized by collections.
**PostgreSQL Schema:**
```sql
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
collection TEXT NOT NULL,
data JSONB NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(collection, id)
);
CREATE INDEX idx_docs_collection ON documents(collection);
CREATE INDEX idx_docs_data ON documents USING GIN(data);
```
**Rhai SDK:**
```rhai
// Create a document
let doc_id = docs.create("users", #{
    name: "Alice",
    email: "alice@example.com",
    tags: ["vip", "beta"]
});
// Find by ID
let user = docs.find("users", doc_id);
// Update document
docs.update("users", doc_id, #{
    last_login: "2026-04-10T12:00:00Z"
});
// Delete document
docs.delete("users", doc_id);
// Query by field (simple equality, v1.2+ advanced queries)
let admins = docs.query("users", #{ role: "admin" });
// List all in collection (with pagination)
let all_users = docs.list("users", #{ limit: 100, offset: 0 });
```
**Use Cases:**
- User profiles, orders, transactions
- Event log / audit trail
- Content (posts, articles, comments)
- Configuration documents
- Workflow state
---
### 8.3 Email Service
**Purpose**: Send outgoing emails via SMTP.
**Configuration (stored in orchestrator config):**
```yaml
email:
  smtp_host: "smtp.gmail.com"
  smtp_port: 587
  smtp_user: "your-email@gmail.com"
  smtp_password: "app-password" # Or from secrets manager
  from_address: "noreply@yourdomain.com"
  from_name: "Serverless Cloud"
```
**Rhai SDK:**
```rhai
// Simple send
email.send(#{
    to: "user@example.com",
    subject: "Welcome!",
    body: "Hello, welcome to our service."
});
// HTML body
email.send(#{
    to: "user@example.com",
    subject: "Welcome!",
    html: "<h1>Welcome!</h1><p>Hello user.</p>",
    text: "Welcome! Hello user." // Fallback
});
// With CC, BCC, reply-to
email.send(#{
    to: "user@example.com",
    cc: "admin@example.com",
    bcc: "archive@example.com",
    reply_to: "support@example.com",
    subject: "Notification",
    body: "..."
});
// Template-like (basic string interpolation)
let name = ctx.request.body.name;
email.send(#{
    to: ctx.request.body.email,
    subject: `Welcome, ${name}!`,
    body: `Hi ${name},\n\nWelcome to our service.`
});
```
**Use Cases:**
- Welcome emails on sign-up
- Notifications (password reset, order status)
- Alerts from scripts
- Digest emails from queued data
---
## 9. v1.2+ Future Vision: Workflows & Hierarchies
### 9.1 Function Workflows (DAG Execution)
**Concept**: Chain multiple functions together in a directed acyclic graph (DAG).
**Example:**
```
Function A (process raw data)
  ↓
Function B (validate data)
  ↓
Function C (store in DB + send notification)
```
**Workflow Definition (YAML, v1.2+):**
```yaml
name: "data-pipeline"
description: "Process, validate, store data"
steps:
  - name: "process"
    function: "process-raw-data"
    input: "{{ trigger.body }}"
  - name: "validate"
    function: "validate-data"
    input: "{{ steps.process.output }}"
    on_error: "fail" # or "skip", "retry"
  - name: "store"
    function: "store-and-notify"
    input: "{{ steps.validate.output }}"
    timeout: 60
    retry:
      attempts: 3
      backoff: "exponential"
output: "{{ steps.store.output }}"
```
**Features:**
- Sequential execution (A → B → C)
- Parallel execution (B & C in parallel after A)
- Conditional branching (if A succeeds, run B; else run C)
- Error handling (fail fast, skip, retry with backoff)
- Data passing between steps (output of A → input of B)
- Workflow state tracking + execution history
- Timeout per step + total timeout
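The sequential case with data passing and per-step error handling can be sketched as a simple loop. This is an illustration under assumptions, not the planned engine: steps are plain function pointers over strings, only the `fail` and `skip` policies are modelled (retry, parallelism, branching, and timeouts are left out), and a skipped step passes the previous output through unchanged. `OnError`, `Step`, and `run_workflow` are hypothetical names.

```rust
/// Error policies from the workflow definition ("retry" is omitted in this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OnError {
    Fail,
    Skip,
}

/// One workflow step; `run` maps the step input to an output or an error message.
pub struct Step {
    pub name: &'static str,
    pub on_error: OnError,
    pub run: fn(&str) -> Result<String, String>,
}

/// Run steps in order, feeding each step's output into the next step.
pub fn run_workflow(steps: &[Step], trigger_body: &str) -> Result<String, String> {
    let mut current = trigger_body.to_string();
    for step in steps {
        match (step.run)(&current) {
            Ok(output) => current = output,
            Err(err) => match step.on_error {
                // Fail fast: abort the whole workflow with the step's error.
                OnError::Fail => {
                    return Err(format!("step '{}' failed: {}", step.name, err));
                }
                // Skip: keep the previous output as input for the next step.
                OnError::Skip => {}
            },
        }
    }
    Ok(current)
}
```

In the real system each `run` would be a function invocation through the Orchestrator, and the intermediate outputs would be persisted in `steps_state` rather than held in a local variable.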
**Schema (PostgreSQL):**
```sql
CREATE TABLE workflows (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
description TEXT,
definition JSONB NOT NULL, -- YAML parsed as JSON
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE workflow_executions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workflow_id UUID REFERENCES workflows(id),
status TEXT, -- 'pending', 'running', 'success', 'failed'
steps_state JSONB, -- { "process": { output: ... }, "validate": { output: ... } }
error_message TEXT,
started_at TIMESTAMP,
completed_at TIMESTAMP
);
```
---
### 9.2 Function Hierarchy (Parent/Child Invocation)
**Concept**: Functions can invoke other functions and wait for results (like microservice calls).
**Example:**
```
Parent Function A
├─ Child Function B (sync call, waits)
├─ Child Function C (sync call, waits)
└─ Child Function D (async, fire-and-forget)
```
**Rhai SDK:**
```rhai
// Synchronous invoke (waits for result)
let result_b = invoke("function-b", #{ param: "value" });
let result_c = invoke("function-c", #{ param: "value" });
// Process results
if result_b.statusCode == 200 {
    let data = result_b.body;
    // ... process
}
// Asynchronous invoke (fire-and-forget)
invoke_async("function-d", #{ param: "value" });
// Invoke with timeout
let result = invoke("function-b", #{ param: "value" }, #{ timeout: 30 });
```
**Orchestrator Behavior:**
- Parent function execution starts container
- Child function invocation: spawn new container (nested execution)
- Sync: parent waits; async: parent continues
- Error handling: propagate up or catch locally
- Timeout cascading: child timeout ≤ parent timeout
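The timeout cascading rule can be made concrete with a small helper. The source only requires that a child's timeout not exceed the parent's; the sketch below goes one step further and caps the child at the parent's remaining budget, which is an assumption. The function name `child_timeout_secs` is hypothetical.

```rust
/// Effective timeout for a child invocation, in seconds.
/// Returns None when the parent has no time left, so the child should not start.
pub fn child_timeout_secs(
    parent_timeout: u64,
    parent_elapsed: u64,
    requested: Option<u64>,
) -> Option<u64> {
    // Time the parent still has; None if it is already over budget.
    let remaining = parent_timeout.checked_sub(parent_elapsed)?;
    if remaining == 0 {
        return None;
    }
    // Use the requested timeout if given, but never more than the remaining budget.
    Some(requested.map_or(remaining, |r| r.min(remaining)))
}
```

For example, a parent with a 30 s timeout that has already run for 10 s would grant a child asking for 30 s only 20 s under this rule.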
**Call Graph Tracking:**
```
Function Execution Tree:
parent-func-exec-123
├─ child-b-exec-456 (sync, 200ms)
├─ child-c-exec-789 (sync, 500ms)
└─ child-d-exec-012 (async, initiated)
Total execution: 700ms (sum of the sync child times; the async child is not awaited)
```
**Schema (PostgreSQL):**
```sql
ALTER TABLE execution_logs
    ADD COLUMN parent_execution_id UUID REFERENCES execution_logs(id),
    ADD COLUMN invocation_type TEXT, -- 'http', 'parent_sync', 'parent_async'
    ADD COLUMN call_depth INT DEFAULT 0; -- Track nesting level
CREATE INDEX idx_execution_parent ON execution_logs(parent_execution_id);
```
---
### 9.4 Service Interceptors & Middleware (v1.2+)
**Concept**: A script can act as middleware to intercept and validate/transform service operations before they execute.
**Use Cases:**
- Auth function intercepts S3 writes: validate user permissions
- Audit function intercepts document updates: log all mutations
- Rate-limiting function intercepts queue sends: enforce quotas
- Data validation function intercepts DB operations: enforce schema
**Script Configuration (at upload):**
```json
{
"name": "auth-interceptor",
"description": "Authorize S3 writes",
"version": 1,
"script_content": "...",
"interceptors": {
"s3": {
"before_write": true,
"before_read": false
},
"queue": {
"before_send": true
},
"documents": {
"before_create": true,
"before_update": true,
"before_delete": true
},
"kv": {
"before_set": false,
"before_delete": false
}
}
}
```
**Interceptor Script Execution:**
When another script calls `s3.put("bucket", "key", data)`:
1. Orchestrator checks if any interceptor is registered for `s3.before_write`
2. If yes, spawn interceptor script with context:
```rhai
ctx.operation = #{
    service: "s3",
    action: "write",
    bucket: "bucket",
    key: "key",
    caller_script_id: "...",
    caller_execution_id: "..."
};
ctx.data = #{ ... }; // The data being written
```
3. Interceptor script returns: `{ allowed: true/false, reason: "...", data: {...} }`
4. If `allowed: false`, reject the operation → error to caller
5. If `allowed: true`, use potentially modified `data` → execute `s3.put()`
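**Flow sketch (Rust, illustrative):** The five steps above can be modelled as a registry lookup followed by a decision check. Everything here is a stand-in: `Registry`, `Decision`, and the `"s3.before_write"` key layout are hypothetical, an interceptor is a plain function instead of a spawned script, and running multiple interceptors in registration order is an assumption (chaining order is still an open question).

```rust
use std::collections::HashMap;

/// Decision returned by an interceptor, mirroring `{ allowed, reason, data }`.
struct Decision {
    allowed: bool,
    reason: Option<String>,
    data: Option<String>,
}

/// Stand-in for running an interceptor script against the operation's data.
type Interceptor = fn(&str) -> Decision;

/// Hypothetical registry keyed by hook name, e.g. "s3.before_write".
struct Registry {
    hooks: HashMap<String, Vec<Interceptor>>,
}

impl Registry {
    /// Runs every interceptor registered for `hook`.
    /// Returns the (possibly modified) data, or the denial reason.
    fn apply(&self, hook: &str, data: String) -> Result<String, String> {
        let mut current = data;
        if let Some(list) = self.hooks.get(hook) {
            for interceptor in list {
                let decision = interceptor(&current);
                if !decision.allowed {
                    // Step 4: reject the operation and surface the reason.
                    return Err(decision
                        .reason
                        .unwrap_or_else(|| "denied by interceptor".to_string()));
                }
                // Step 5: continue with the data the interceptor returned, if any.
                if let Some(new_data) = decision.data {
                    current = new_data;
                }
            }
        }
        // No interceptor registered, or all allowed: proceed with the operation.
        Ok(current)
    }
}
```

The caller would invoke `apply("s3.before_write", data)` before executing `s3.put()` and only perform the write on `Ok`.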
**Interceptor Script Example:**
```rhai
// Auth interceptor for S3
let user_id = ctx.request.body.user_id;
let key = ctx.operation.key;
// Check if user owns this key
let allowed = kv.get("permissions", `user:${user_id}:s3:${key}`);
if allowed {
    log.info("S3 write authorized", #{ user_id: user_id, key: key });
    // Last expression is the script's return value
    #{
        allowed: true,
        data: ctx.data // Optionally transform/add metadata
    }
} else {
    log.warn("S3 write denied", #{ user_id: user_id, key: key });
    #{
        allowed: false,
        reason: "User does not have write permission"
    }
}
```
**Availability Matrix (v1.2+):**
| Service | Before Operations |
|---------|------------------|
| **S3** | read, write, delete, list |
| **Documents** | create, read, update, delete, query |
| **KV** | set, get, delete |
| **Queue** | send, send_batch |
| **Email** | send |
| **HTTP** | get, post, put, delete |
| **Functions (invoke)** | call, call_async |
| **Users** | create, update, authenticate, lock, delete |
**Notes:**
- Inbound HTTP triggers have NO before interceptors (they're entry points); the **HTTP** row above covers outbound requests made by scripts
- Interceptors are **per-script, opt-in** (scripts only intercept what they explicitly configure)
- When an interceptor returns `{ allowed: false }`, the operation is rejected and the original caller gets an error
- Interceptor failures are logged in audit trail
- **v1.3+ consideration**: Global policies / RBAC layer on top of interceptors
---
## 10. Open Questions & Notes
### Architecture
- [ ] **Container image caching**: Should we keep a warm executor image in memory between requests? (v1.1 optimization)
- [ ] **Script isolation**: Do we need process-level isolation beyond Docker (seccomp, AppArmor)?
- [ ] **Networking**: Can scripts initiate outbound connections? (deferred to v1.1)
### v1.1 Services
- [ ] **KV expiration**: Background cleanup task for expired keys, or lazy deletion?
- [ ] **Document queries**: Start with simple equality, or support complex filters (v1.2)?
- [ ] **Email retries**: If SMTP fails, retry strategy (exponential backoff)?
- [ ] **SMTP configuration**: Environment variables, config file, or dashboard UI?
- [ ] **User password hashing**: Use bcrypt, Argon2, or scrypt? What cost factor?
- [ ] **User invitations**: Email template customization? Configurable expiration?
- [ ] **Passwordless login**: Email-based or SMS-based login links?
- [ ] **Session management**: Sessions table for tracking login tokens/refresh tokens?
- [ ] **2FA/MFA**: In-scope for v1.1 or defer to v1.2?
### v1.2+ Workflows & Hierarchies
- [ ] **Workflow DAG format**: YAML, JSON, or domain-specific language (DSL)?
- [ ] **Branching logic**: Simple if/else, or complex conditions (switch/case)?
- [ ] **Workflow versioning**: Support multiple versions with rollback?
- [ ] **Call graph limits**: Max depth of nested function calls (prevent runaway recursion)?
- [ ] **Timeout cascading**: How strictly to enforce (child ≤ parent)?
- [ ] **Observability**: Generate trace IDs for call graphs, visualize in dashboard?
### v1.2+ Service Interceptors
- [ ] **Interceptor chaining**: If multiple scripts intercept the same operation, what is the execution order?
- [ ] **Performance**: Interceptor overhead on every service call — caching/optimization needed?
- [ ] **Interceptor failures**: If interceptor times out, fail the entire operation or allow bypass?
- [ ] **Circular dependencies**: How to prevent interceptor A from calling a service that triggers interceptor B, which in turn calls A?
- [ ] **Audit trail**: Log all interceptor decisions (allowed/denied) automatically?
- [ ] **Debugging**: How to trace interceptor execution in logs/dashboard?
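**Cycle guard sketch (Rust, illustrative):** One possible answer to the circular-dependency question is to track the interceptors currently active for a given caller execution and refuse to re-enter one that is already on the stack. `InterceptorChain` and its methods are hypothetical names; this is one option under that assumption, not a decided design.

```rust
/// Interceptor scripts currently running on behalf of one caller execution.
struct InterceptorChain {
    active: Vec<String>,
}

impl InterceptorChain {
    fn new() -> Self {
        InterceptorChain { active: Vec::new() }
    }

    /// Marks `script` as running, or reports the cycle if it is already active.
    fn enter(&mut self, script: &str) -> Result<(), String> {
        if self.active.iter().any(|s| s == script) {
            return Err(format!(
                "interceptor cycle: {} -> {}",
                self.active.join(" -> "),
                script
            ));
        }
        self.active.push(script.to_string());
        Ok(())
    }

    /// Marks the most recently entered interceptor as finished.
    fn exit(&mut self) {
        self.active.pop();
    }
}
```

The error string doubles as a trace of the chain, which could feed the audit trail and debugging questions above.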
### Rhai & SDK
- [ ] **Module loading**: Can scripts `import` external Rhai modules? (probably no for MVP)
- [ ] **File system access**: Can scripts read/write to local filesystem? (no for MVP)
- [ ] **Request/response sizes**: Max payload size? (set sensible default, e.g., 10MB)
### Operations
- [ ] **Container logs**: Capture executor stdout/stderr → attach to execution log? (yes, nice to have)
- [ ] **Script parsing errors**: Fail at upload time or runtime? (recommend: validate at upload by compiling the script with Rhai)
- [ ] **Garbage collection**: How often to prune old execution logs? (optional MVP, monthly default)
### Future Integrations
- [ ] **Metrics backend**: Prometheus, InfluxDB, or local file?
- [ ] **Log aggregation**: ELK, Loki, or just local files?
- [ ] **Secrets backend**: Hashicorp Vault, local encrypted file, or built-in?
---
## 13. Success Metrics (MVP)
1. **Deployment ease**: Script uploaded and responding to HTTP in < 1 minute
2. **Performance**: p95 latency < 500ms (including container startup)
3. **Resource efficiency**: Server CPU/memory stays < 30% at rest, scales only on active requests
4. **Reliability**: 99.5% uptime, no memory leaks or orphaned containers
5. **Developer experience**: Dashboard feels responsive, errors are clear
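**p95 check sketch (Rust, illustrative):** The latency target in item 2 could be verified from recorded execution durations with a nearest-rank percentile. The function names and the choice of the nearest-rank method are assumptions for this sketch; a metrics backend might compute percentiles differently.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// `pct` is a whole-number percentile in 0..=100; returns `None` for no samples.
fn percentile_ms(samples: &[u64], pct: u8) -> Option<u64> {
    if samples.is_empty() || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // rank = ceil(pct * n / 100), computed with integers to avoid float rounding.
    let rank = (pct as usize * sorted.len() + 99) / 100;
    // Ranks are 1-based; clamp so that pct = 0 maps to the smallest sample.
    Some(sorted[rank.max(1) - 1])
}

/// True when the p95 latency is under the 500 ms MVP target.
fn meets_latency_target(samples: &[u64]) -> bool {
    matches!(percentile_ms(samples, 95), Some(p) if p < 500)
}
```

Samples should include container startup time, since the target is defined end to end.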
---
## 14. Assumptions & Dependencies
**Assumptions:**
- Single server, modest hardware (2GB+ RAM, dual-core CPU)
- Rhai is mature enough for MVP (checked v1.12+)
- Docker daemon available on target machine
- PostgreSQL can be containerized (not separate managed service)
**Dependencies:**
- Docker (for executor runtime)
- Rust 1.70+ (for Orchestrator build)
- Rhai crate (script execution)
- Axum crate (HTTP framework)
- PostgreSQL client library (sqlx or tokio-postgres)
- Alpine Linux (executor base image)
---
## 16. Next Steps
1. **Clarify any ambiguities** in this blueprint
2. **Spike: Rhai executor image** — build minimal Alpine + Rhai image, test startup time
3. **Spike: Axum API** — scaffold REST endpoints for script CRUD
4. **Spike: PostgreSQL schema** — finalize schema, migrations
5. **Build Phase 1**: Orchestrator → Dashboard → Executor → docker-compose integration
---
## Document Control
| Version | Date | Author | Notes |
|---------|------|--------|-------|
| 1.0 | 2026-04-10 | Blueprint | MVP scope, architecture, tech stack locked |