Files
PiCloud/docs/versioning.md
MechaCat02 bb88b024d2 docs(versioning): post-1.0 policy with expansion-phase carve-out
Rewrites the "When to bump what" section now that the project is
post-1.0. Replaces the pre-1.0 framing with three explicit rules:

  - Major: surface major bump on a user-facing contract
  - Minor: phase milestone or coherent capability cluster, aligned
    with blueprint Phase boundaries (Phase 5 -> v1.2, etc.)
  - Patch: bug fixes AND additive-only surface changes

The carve-out (patch for additive surface changes) resolves the
tension with the v1.1.x roadmap: every v1.1.x release adds SDK or
schema surface, and strict "minor product bump per minor surface
bump" would inflate the version faster than the user-perceived
"platform changed" milestones warrant.

Examples updated to reflect post-1.0 numbers and the new policy:
adding KV in v1.1.1 (patch), cutting v1.2 as a phase milestone
(minor), renaming a ctx field (major).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 20:41:48 +02:00

158 lines
9.8 KiB
Markdown

# Versioning
PiCloud carries **one product version** for the build you install, and **independent versions on the four contracts that actually break for users**. The product version answers "which build do I have"; surface versions answer "which contracts does that build honor".
This split exists because crate-level SemVer between, say, `picloud-shared` and `picloud-manager-core` is fiction — they always ship together. The boundaries that matter are user-facing: scripts depending on the SDK, callers hitting the HTTP API, databases shared across deploys, and (later) executor nodes talking to a manager.
---
## What gets a version
### Lockstep — one number for the whole thing
All of these carry the same version and are bumped together:
- Every crate in the Cargo workspace (via `version.workspace = true`)
- The dashboard's `package.json`
- Docker image tags (`picloud:1.1.0`)
- Git tags (`v1.1.0`)
Defined once in [`Cargo.toml`](../Cargo.toml) under `[workspace.package]`. There is no scenario where one crate is at a different version than another in the same build.
### Independent — versioned at each surface
| Surface | Where the version lives | Format | Bump rule |
|---|---|---|---|
| **Rhai SDK** | [`shared::version::SDK_VERSION`](../crates/shared/src/version.rs), exposed to scripts as `ctx.sdk_version` | `"major.minor"` string | Minor: additions; Major: removals/renames/retyped |
| **HTTP API** | URL prefix `/api/v{N}/...`; `shared::version::API_VERSION` is the current major | integer | New integer when request/response shape, status semantics, or auth model changes |
| **Database schema** | Largest applied migration ID (`manager-core::migrations::latest_version()`) | integer, monotonic | One per forward migration; never edit a committed file |
| **Inter-service wire** (cluster mode, v1.3+) | `X-PiCloud-Wire` request header; `shared::version::WIRE_VERSION` | integer | New integer when RPC shape changes |
All five live in one place so `/version` can return them honestly.
---
## Per-surface compatibility rules
### Rhai SDK (strictest)
Scripts run in production with no recompile. A wrong SDK bump silently breaks user code.
- **Patch** (`1.2.0 → 1.2.1`) — doc fixes, internal optimizations. No script-observable change.
- **Minor** (`1.2 → 1.3`) — added functions; added optional `ctx.*` fields; relaxed limits; new variants accepted alongside old ones. **Every script written for 1.2 must still run unchanged on 1.3.**
- **Major** (`1 → 2`) — anything removed, renamed, retyped, restricted, or made required.
Scripts can detect available features at runtime:
```rhai
if ctx.sdk_version >= "1.2" {
// call kv.* (added in 1.2)
}
```
The contract test in `crates/executor-core/tests/sdk_contract/` (coming alongside the first SDK additions) holds golden scripts that exercise every documented SDK surface. They must pass on every commit. A minor bump that breaks any of them is a build failure.
### HTTP API
Path prefix is the version. **Within a major**, the following are non-breaking and welcome:
- New endpoints
- New optional request fields
- New response fields (clients must ignore unknown fields)
- New `Deprecation:` headers warning of upcoming removals
The following require a new major (`/api/v2/...`):
- Removed endpoints, removed response fields, renamed fields
- Changed request-field types or required-field additions
- Changed status-code semantics for the same outcome
- Auth model changes
When `vN+1` ships, `vN` stays live for **at least one product minor** (so users have a release cycle to migrate). Deprecation is announced via the `Deprecation: true` and `Sunset: <date>` response headers on the old prefix before removal.
### Database schema
- **Forward-only.** Never edit a migration that has shipped. If a migration was wrong, write a new one that fixes it.
- Migrations are numbered sequentially (`0001_init.sql`, `0002_*.sql`, ...). The number is the schema version.
- A given binary applies migrations strictly greater than the last-applied ID, then refuses to start if its embedded migrations are *older* than what's in the DB — that would imply a downgrade, which is never automatic.
- This makes rolling deploys safe: the schema is always "ahead of or equal to" any running binary in the cluster.
### Wire protocol (cluster mode, v1.3+)
- Inter-service RPCs include `X-PiCloud-Wire: N`.
- A peer that doesn't recognize `N` refuses the call and returns `426 Upgrade Required` with the version it speaks.
- Both versions must be live in the cluster during rolling upgrades — current and current-minus-one — until all nodes agree on the new one.
---
## How we check and enforce
A versioning scheme without enforcement decays in months. Five cheap mechanical checks:
1. **Compile-time uniformity.** All workspace crates inherit `version.workspace = true`. Drift is impossible to introduce.
2. **Runtime self-report.** `GET /version` returns every surface version. Dashboards, monitoring, inter-service handshakes, and humans all read from one source. `/healthz` stays a plain `"ok"` string for k8s probes — version negotiation is a separate concern.
3. **Golden SDK contract tests.** `tests/sdk_contract/` Rhai scripts exercise every SDK surface and must pass on every commit. The contract is the test.
4. **Migration replay test.** An integration test that boots a fresh Postgres, applies every migration in order, and asserts the resulting schema. Catches the most common mistake (edited-not-added migration).
5. **CI guardrail script.** [`scripts/check-versioning.sh`](../scripts/check-versioning.sh) — runs the structural checks that don't need git history:
- Migration files are numbered sequentially from `0001_*.sql` with no gaps.
- `SDK_VERSION` parses as `MAJOR.MINOR` (numeric, no extra components).
- `[workspace.package].version` parses as `MAJOR.MINOR.PATCH`.
Run manually as `bash scripts/check-versioning.sh`. Wires into CI when CI exists. Deferred to the same future PR that introduces CI: SDK-major-bump-needs-CHANGELOG and `BREAKING:` commit-message annotation (both need git history + a CHANGELOG file that doesn't exist yet).
(3) and (4) are now in place: [`crates/executor-core/tests/sdk_contract.rs`](../crates/executor-core/tests/sdk_contract.rs) holds the SDK contract suite; [`crates/manager-core/tests/schema_snapshot.rs`](../crates/manager-core/tests/schema_snapshot.rs) holds the schema snapshot guard.
---
## When to bump what
The product version uses SemVer with one carve-out for the platform's expansion cadence:
- **Major** (`1.x → 2.0`) — surface major bump on a user-facing contract: removed/renamed/retyped SDK function, retired API version, breaking schema change that requires user action, breaking wire-protocol change.
- **Minor** (`1.1 → 1.2`) — phase milestone or coherent capability cluster. Bumped when the maintainer marks a release as "the platform moved forward in a way that warrants a number". Typically aligned with blueprint Phase boundaries (Phase 5 → v1.2, Phase 6 → v1.3+).
- **Patch** (`1.1.0 → 1.1.1`) — everything else: bug fixes AND **additive-only surface changes**. New SDK function, new admin endpoint, new schema migration that only adds tables/columns, new env var, new trigger kind — all patch.
**Why the carve-out:** PiCloud ships in many small additive PRs (every v1.1.x release adds SDK surface). A strict "minor product bump per minor surface bump" rule would inflate the product version faster than the actual user-perceived "platform changed" milestones warrant. Patch-for-additions keeps the minor digit aligned with capability clusters, not individual feature drops.
**Surface versions follow their own rules** (table above) and don't track the product version. A surface can independently hit its own `1.0` or `2.0`. The SDK in particular is likely to stabilize before the platform does, since scripts in production demand it.
---
## Current versions
| | Version |
|---|---|
| Product | `1.1.0` |
| SDK | `1.1` (adds `ctx.request.params`, `ctx.request.query`, `ctx.request.rest`) |
| API | `1` (additive: `Script.app_id`, `Route.app_id`, `ExecutionLog.app_id`, new `/api/v1/admin/apps/*` and `/api/v1/admin/api-keys/*` endpoints, `?app=` filter on script list, `Authorization: Bearer pic_…` credential type, 403 responses on previously-401-only admin endpoints when the caller lacks the required capability) |
| Schema | `6` (matches `migrations/0006_users_authz.sql`) |
| Wire | `1` (reserved; cluster mode not implemented) |
Read live from `GET /version` on any running instance.
---
## Examples
**Adding a `kv.*` SDK in v1.1.1:**
- Workspace bump: `1.1.0 → 1.1.1` (patch — additive SDK + schema, no breakage)
- SDK bump: `"1.1" → "1.2"` (added functions only)
- API bump: none (admin endpoints for trigger CRUD are additive)
- Schema bump: `6 → 7` (`0007_kv_store.sql` adds the `kv_store` table)
**Cutting the v1.2 release (Phase 5: workflows, advanced query, interceptors):**
- Workspace bump: `1.1.8 → 1.2.0` (minor — phase milestone)
- Even if no individual change is breaking, the maintainer-marked phase transition warrants the minor digit.
**Renaming `ctx.execution_id` to `ctx.exec_id`:**
- SDK bump: `"1.x" → "2.0"` (breaking — removed/retyped script-visible field)
- Workspace bump: `1.x.y → 2.0.0` (product major — user-facing contract break)
- Migration path: keep `ctx.execution_id` available in 1.x for a deprecation window, add `ctx.exec_id` alongside; flip to 2.0 only when both fields have shipped together for a release.
**Adding pagination to `GET /api/v1/admin/scripts`:**
- New optional `?limit=&offset=` query params with sensible defaults → no API bump
- Response keeps the same shape; clients that don't pass `limit` see the old behavior → no API bump
**Changing the response shape of `GET /api/v1/admin/scripts/{id}` to wrap in `{ script: {...} }`:**
- Breaking. Ship as `/api/v2/admin/scripts/{id}`. Keep `/api/v1` live until at least one product minor passes.