PiCloud/docs/versioning.md
MechaCat02 bb88b024d2 docs(versioning): post-1.0 policy with expansion-phase carve-out
Rewrites the "When to bump what" section now that the project is
post-1.0. Replaces the pre-1.0 framing with three explicit rules:

  - Major: surface major bump on a user-facing contract
  - Minor: phase milestone or coherent capability cluster, aligned
    with blueprint Phase boundaries (Phase 5 -> v1.2, etc.)
  - Patch: bug fixes AND additive-only surface changes

The carve-out (patch for additive surface changes) resolves the
tension with the v1.1.x roadmap: every v1.1.x release adds SDK or
schema surface, and strict "minor product bump per minor surface
bump" would inflate the version faster than the user-perceived
"platform changed" milestones warrant.

Examples updated to reflect post-1.0 numbers and the new policy:
adding KV in v1.1.1 (patch), cutting v1.2 as a phase milestone
(minor), renaming a ctx field (major).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 20:41:48 +02:00

Versioning

PiCloud carries one product version for the build you install, and independent versions on the four contracts that actually break for users. The product version answers "which build do I have"; surface versions answer "which contracts does that build honor".

This split exists because crate-level SemVer between, say, picloud-shared and picloud-manager-core is fiction — they always ship together. The boundaries that matter are user-facing: scripts depending on the SDK, callers hitting the HTTP API, databases shared across deploys, and (later) executor nodes talking to a manager.


What gets a version

Lockstep — one number for the whole thing

All of these carry the same version and are bumped together:

  • Every crate in the Cargo workspace (via version.workspace = true)
  • The dashboard's package.json
  • Docker image tags (picloud:1.1.0)
  • Git tags (v1.1.0)

Defined once in Cargo.toml under [workspace.package]. There is no scenario where one crate is at a different version than another in the same build.

Independent — versioned at each surface

| Surface | Where the version lives | Format | Bump rule |
| --- | --- | --- | --- |
| Rhai SDK | shared::version::SDK_VERSION, exposed to scripts as ctx.sdk_version | "major.minor" string | Minor: additions; Major: removals/renames/retyped |
| HTTP API | URL prefix /api/v{N}/...; shared::version::API_VERSION is the current major | integer | New integer when request/response shape, status semantics, or auth model changes |
| Database schema | Largest applied migration ID (manager-core::migrations::latest_version()) | integer, monotonic | One per forward migration; never edit a committed file |
| Inter-service wire (cluster mode, v1.3+) | X-PiCloud-Wire request header; shared::version::WIRE_VERSION | integer | New integer when RPC shape changes |

All five versions (the product version plus the four surface versions) live in one place so /version can return them honestly.
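As a rough sketch of what "one place" could look like, the constants below mirror the values listed under "Current versions". The module layout, the constant types, the `version_body` helper, and the JSON field names are assumptions for illustration, not the actual contents of `shared::version`.

```rust
// Hypothetical mirror of `shared::version`; values follow "Current versions".
pub const PRODUCT_VERSION: &str = "1.1.0";
pub const SDK_VERSION: &str = "1.1";
pub const API_VERSION: u32 = 1;
pub const WIRE_VERSION: u32 = 1;

/// Builds a JSON body for `GET /version` (field names are assumed).
/// The schema version depends on the database, so it is passed in
/// by the caller instead of being a compile-time constant.
pub fn version_body(schema_version: u32) -> String {
    format!(
        "{{\"product\":\"{}\",\"sdk\":\"{}\",\"api\":{},\"schema\":{},\"wire\":{}}}",
        PRODUCT_VERSION, SDK_VERSION, API_VERSION, schema_version, WIRE_VERSION
    )
}
```

Keeping every constant in a single module means a handler, a dashboard, and a peer handshake would all read the same values.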


Per-surface compatibility rules

Rhai SDK (strictest)

Scripts run in production with no recompile. A wrong SDK bump silently breaks user code.

  • Patch (1.2.0 → 1.2.1) — doc fixes, internal optimizations. No script-observable change.
  • Minor (1.2 → 1.3) — added functions; added optional ctx.* fields; relaxed limits; new variants accepted alongside old ones. Every script written for 1.2 must still run unchanged on 1.3.
  • Major (1 → 2) — anything removed, renamed, retyped, restricted, or made required.

Scripts can detect available features at runtime:

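// Compare numerically; as strings, "1.10" would sort below "1.2".
let v = ctx.sdk_version.split(".");
if parse_int(v[0]) > 1 || (parse_int(v[0]) == 1 && parse_int(v[1]) >= 2) {
    // call kv.* (added in 1.2)
}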
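One caveat with "major.minor" strings is that lexicographic ordering ranks "1.10" below "1.2", so version checks are safer when done on parsed numbers. A minimal host-side sketch in Rust, where `parse_sdk_version` and `sdk_at_least` are hypothetical helper names and not part of the documented SDK:

```rust
/// Parses a "major.minor" string into a numeric pair.
/// Returns None for other shapes such as "1", "1.2.3", or "1.x".
pub fn parse_sdk_version(s: &str) -> Option<(u32, u32)> {
    let (major, minor) = s.split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Some(true) when `have` is at least `need`, comparing major first,
/// then minor. None when either string is not "major.minor".
pub fn sdk_at_least(have: &str, need: &str) -> Option<bool> {
    Some(parse_sdk_version(have)? >= parse_sdk_version(need)?)
}
```

Tuples compare element by element, so `(1, 10) >= (1, 2)` holds even though the string form would say otherwise.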

The contract test in crates/executor-core/tests/sdk_contract/ holds golden scripts that exercise every documented SDK surface. They must pass on every commit. A minor bump that breaks any of them is a build failure.

HTTP API

Path prefix is the version. Within a major, the following are non-breaking and welcome:

  • New endpoints
  • New optional request fields
  • New response fields (clients must ignore unknown fields)
  • New Deprecation: headers warning of upcoming removals

The following require a new major (/api/v2/...):

  • Removed endpoints, removed response fields, renamed fields
  • Changed request-field types or required-field additions
  • Changed status-code semantics for the same outcome
  • Auth model changes

When vN+1 ships, vN stays live for at least one product minor (so users have a release cycle to migrate). Deprecation is announced via the Deprecation: true and Sunset: <date> response headers on the old prefix before removal.
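The deprecation rule above can be sketched as a small function that decides which headers an old prefix should carry. This assumes the server knows the newest API major and receives the sunset date as an already formatted string; `api_major` and `deprecation_headers` are hypothetical names.

```rust
/// Extracts N from a path such as "/api/v1/admin/scripts".
pub fn api_major(path: &str) -> Option<u32> {
    let rest = path.strip_prefix("/api/v")?;
    let digits = rest.split('/').next()?;
    digits.parse().ok()
}

/// Headers to attach to a response served from `path`.
/// `current_major` is the newest API major that has shipped;
/// `sunset` is the removal date, already formatted by the caller.
pub fn deprecation_headers(
    path: &str,
    current_major: u32,
    sunset: &str,
) -> Vec<(String, String)> {
    match api_major(path) {
        // Only prefixes older than the current major are marked.
        Some(n) if n < current_major => vec![
            ("Deprecation".to_string(), "true".to_string()),
            ("Sunset".to_string(), sunset.to_string()),
        ],
        _ => Vec::new(),
    }
}
```

Requests on the current prefix, or on paths outside /api/v{N}/, get no extra headers in this sketch.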

Database schema

  • Forward-only. Never edit a migration that has shipped. If a migration was wrong, write a new one that fixes it.
  • Migrations are numbered sequentially (0001_init.sql, 0002_*.sql, ...). The number is the schema version.
  • A given binary applies only the embedded migrations whose IDs are strictly greater than the last-applied ID. It refuses to start if the DB has already applied a migration newer than any it embeds: that would imply a downgrade, which is never automatic.
  • This makes rolling deploys safe: the schema is always "ahead of or equal to" any running binary in the cluster.
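The startup rule can be sketched as a pure planning function. This is an illustration under assumptions, not the manager-core implementation: `plan_migrations` is a hypothetical name, migration IDs are modelled as plain integers, and 0 stands for a fresh database.

```rust
/// Decides what a starting binary should do, given the migration IDs
/// it embeds and the last ID recorded as applied in the database.
/// Ok(pending) lists the IDs to apply, in ascending order.
/// Err means the database is ahead of this binary.
pub fn plan_migrations(embedded: &[u32], last_applied: u32) -> Result<Vec<u32>, String> {
    let newest = embedded.iter().copied().max().unwrap_or(0);
    if last_applied > newest {
        return Err(format!(
            "database is at schema {last_applied}, binary only knows {newest}; refusing to start"
        ));
    }
    let mut pending: Vec<u32> = embedded
        .iter()
        .copied()
        .filter(|id| *id > last_applied)
        .collect();
    pending.sort_unstable();
    Ok(pending)
}
```

For example, a binary embedding migrations 1 through 7 against a database at schema 6 would plan to apply only migration 7, while a binary embedding 1 through 6 against a database at schema 7 would refuse to start.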

Wire protocol (cluster mode, v1.3+)

  • Inter-service RPCs include X-PiCloud-Wire: N.
  • A peer that doesn't recognize N refuses the call and returns 426 Upgrade Required with the version it speaks.
  • Both versions must be live in the cluster during rolling upgrades — current and current-minus-one — until all nodes agree on the new one.
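A minimal sketch of the header check described above, assuming each node accepts its current wire version and the one before it. The type and function names are hypothetical, and treating a missing or malformed header as a refusal is an assumption the text does not spell out.

```rust
/// Wire versions this node accepts: current and current-minus-one.
pub fn accepted_wire_versions(current: u32) -> Vec<u32> {
    if current > 1 {
        vec![current - 1, current]
    } else {
        vec![current]
    }
}

/// Outcome of checking an incoming `X-PiCloud-Wire` header value.
#[derive(Debug, PartialEq)]
pub enum WireCheck {
    Accept(u32),
    /// HTTP status to return, plus the version this node speaks.
    Refuse { status: u16, speaks: u32 },
}

pub fn check_wire(header: Option<&str>, current: u32) -> WireCheck {
    let requested = header.and_then(|v| v.trim().parse::<u32>().ok());
    match requested {
        Some(n) if accepted_wire_versions(current).contains(&n) => WireCheck::Accept(n),
        // Assumption: absent or unparsable headers are refused as well.
        _ => WireCheck::Refuse { status: 426, speaks: current },
    }
}
```

During a rolling upgrade from wire version 1 to 2, an upgraded node would accept both values, so calls from nodes that have not upgraded yet keep working.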

How we check and enforce

A versioning scheme without enforcement decays in months. Five cheap mechanical checks:

  1. Compile-time uniformity. All workspace crates inherit version.workspace = true. Drift is impossible to introduce.

  2. Runtime self-report. GET /version returns every surface version. Dashboards, monitoring, inter-service handshakes, and humans all read from one source. /healthz stays a plain "ok" string for k8s probes — version negotiation is a separate concern.

  3. Golden SDK contract tests. tests/sdk_contract/ Rhai scripts exercise every SDK surface and must pass on every commit. The contract is the test.

  4. Migration replay test. An integration test that boots a fresh Postgres, applies every migration in order, and asserts the resulting schema. Catches the most common mistake (edited-not-added migration).

  5. CI guardrail script. scripts/check-versioning.sh — runs the structural checks that don't need git history:

    • Migration files are numbered sequentially from 0001_*.sql with no gaps.
    • SDK_VERSION parses as MAJOR.MINOR (numeric, no extra components).
    • [workspace.package].version parses as MAJOR.MINOR.PATCH.

    Run it manually with bash scripts/check-versioning.sh; it will be wired into CI once CI exists. Two further checks are deferred to the PR that introduces CI: requiring a CHANGELOG entry for any SDK major bump, and requiring a BREAKING: annotation in commit messages. Both need git history and a CHANGELOG file that doesn't exist yet.

(3) and (4) are now in place: crates/executor-core/tests/sdk_contract.rs holds the SDK contract suite; crates/manager-core/tests/schema_snapshot.rs holds the schema snapshot guard.
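The real structural checks live in scripts/check-versioning.sh. The Rust sketch below only illustrates the logic of those three rules; the function names are hypothetical, and the four-digit prefix is inferred from the 0001_*.sql naming shown above.

```rust
/// Checks that migration file names are numbered 0001, 0002, ...
/// with no gaps or duplicates. Input order does not matter.
pub fn check_migration_sequence(names: &[&str]) -> Result<(), String> {
    let mut ids = Vec::new();
    for name in names {
        let prefix = name.split('_').next().unwrap_or("");
        let numeric = prefix.len() == 4 && prefix.bytes().all(|b| b.is_ascii_digit());
        if !numeric || !name.ends_with(".sql") {
            return Err(format!("unexpected migration file name: {name}"));
        }
        let id: u32 = prefix.parse().map_err(|_| format!("bad prefix: {name}"))?;
        ids.push(id);
    }
    ids.sort_unstable();
    for (index, id) in ids.iter().enumerate() {
        let expected = index as u32 + 1;
        if *id != expected {
            return Err(format!("expected migration {expected:04}, found {id:04}"));
        }
    }
    Ok(())
}

/// True when `version` has exactly `parts` dot-separated components,
/// each made only of ASCII digits.
pub fn has_numeric_parts(version: &str, parts: usize) -> bool {
    let fields: Vec<&str> = version.split('.').collect();
    fields.len() == parts
        && fields
            .iter()
            .all(|f| !f.is_empty() && f.bytes().all(|b| b.is_ascii_digit()))
}
```

With these helpers, `has_numeric_parts(sdk, 2)` covers the SDK_VERSION rule and `has_numeric_parts(product, 3)` covers the workspace version rule.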


When to bump what

The product version uses SemVer with one carve-out for the platform's expansion cadence:

  • Major (1.x → 2.0) — surface major bump on a user-facing contract: removed/renamed/retyped SDK function, retired API version, breaking schema change that requires user action, breaking wire-protocol change.
  • Minor (1.1 → 1.2) — phase milestone or coherent capability cluster. Bumped when the maintainer marks a release as "the platform moved forward in a way that warrants a number". Typically aligned with blueprint Phase boundaries (Phase 5 → v1.2, Phase 6 → v1.3+).
  • Patch (1.1.0 → 1.1.1) — everything else: bug fixes AND additive-only surface changes. New SDK function, new admin endpoint, new schema migration that only adds tables/columns, new env var, new trigger kind — all patch.

Why the carve-out: PiCloud ships in many small additive PRs (every v1.1.x release adds SDK surface). A strict "minor product bump per minor surface bump" rule would inflate the product version faster than the actual user-perceived "platform changed" milestones warrant. Patch-for-additions keeps the minor digit aligned with capability clusters, not individual feature drops.

Surface versions follow their own rules (table above) and don't track the product version. A surface can independently hit its own 1.0 or 2.0. The SDK in particular is likely to stabilize before the platform does, since scripts in production demand it.
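The three product-version rules can be condensed into a small decision function. This is a sketch of the policy as stated above, not tooling that exists in the repository; the `Change` enum and `bump` function are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Change {
    /// Major bump on a user-facing contract (SDK, API, schema, wire).
    SurfaceBreak,
    /// Maintainer-marked phase milestone or capability cluster.
    PhaseMilestone,
    /// Additive-only surface change (the carve-out).
    Additive,
    /// Bug fix with no surface change.
    BugFix,
}

/// Returns the next (major, minor, patch) product version.
pub fn bump(version: (u32, u32, u32), change: Change) -> (u32, u32, u32) {
    let (major, minor, patch) = version;
    match change {
        Change::SurfaceBreak => (major + 1, 0, 0),
        Change::PhaseMilestone => (major, minor + 1, 0),
        // The carve-out: additions share the patch digit with bug fixes.
        Change::Additive | Change::BugFix => (major, minor, patch + 1),
    }
}
```

Under this policy an additive release moves 1.1.0 to 1.1.1, and a phase milestone moves 1.1.8 to 1.2.0.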


Current versions

| | Version |
| --- | --- |
| Product | 1.1.0 |
| SDK | 1.1 (adds ctx.request.params, ctx.request.query, ctx.request.rest) |
| API | 1 (additive: Script.app_id, Route.app_id, ExecutionLog.app_id, new /api/v1/admin/apps/* and /api/v1/admin/api-keys/* endpoints, ?app= filter on script list, Authorization: Bearer pic_… credential type, 403 responses on previously-401-only admin endpoints when the caller lacks the required capability) |
| Schema | 6 (matches migrations/0006_users_authz.sql) |
| Wire | 1 (reserved; cluster mode not implemented) |

Read live from GET /version on any running instance.


Examples

Adding a kv.* SDK in v1.1.1:

  • Workspace bump: 1.1.0 → 1.1.1 (patch — additive SDK + schema, no breakage)
  • SDK bump: "1.1" → "1.2" (added functions only)
  • API bump: none (new admin endpoints, if any, are additive)
  • Schema bump: 6 → 7 (0007_kv_store.sql adds the kv_store table)

Cutting the v1.2 release (Phase 5: workflows, advanced query, interceptors):

  • Workspace bump: 1.1.8 → 1.2.0 (minor — phase milestone)
  • Even if no individual change is breaking, the maintainer-marked phase transition warrants the minor digit.

Renaming ctx.execution_id to ctx.exec_id:

  • SDK bump: "1.x" → "2.0" (breaking: renamed script-visible field)
  • Workspace bump: 1.x.y → 2.0.0 (product major — user-facing contract break)
  • Migration path: keep ctx.execution_id available in 1.x for a deprecation window, add ctx.exec_id alongside; flip to 2.0 only when both fields have shipped together for a release.

Adding pagination to GET /api/v1/admin/scripts:

  • New optional ?limit=&offset= query params with sensible defaults → no API bump
  • Response keeps the same shape; clients that don't pass limit see the old behavior → no API bump
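To see why optional parameters keep the change non-breaking, consider a sketch in which an absent limit means "everything from the offset onward". That default is an assumption chosen to match "clients that don't pass limit see the old behavior", and `paginate` is a hypothetical helper.

```rust
/// Returns the requested window of `items`.
/// With neither parameter set, the full list comes back unchanged,
/// which is what callers saw before pagination existed.
pub fn paginate<T: Clone>(items: &[T], limit: Option<usize>, offset: Option<usize>) -> Vec<T> {
    // Clamp the start so an oversized offset yields an empty page.
    let start = offset.unwrap_or(0).min(items.len());
    let end = match limit {
        Some(n) => start.saturating_add(n).min(items.len()),
        None => items.len(),
    };
    items[start..end].to_vec()
}
```

An old client that sends no query parameters takes the `None, None` path and receives the same list as before, so no API bump is needed.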

Changing the response shape of GET /api/v1/admin/scripts/{id} to wrap in { script: {...} }:

  • Breaking. Ship as /api/v2/admin/scripts/{id}. Keep /api/v1 live until at least one product minor passes.