CleanupRegistry's catch-all was masking every kind of teardown error,
not just the intended "resource already gone" 404. A backend returning
500 on delete would leak orphans run after run without ever surfacing.
Now treat 2xx and 404 as success, log any other status (and any
thrown network error) to stderr with the resource label, and keep
running the remaining items. The suite stays best-effort but no
longer hides accumulating leaks.
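
A minimal sketch of the status handling described above, not the actual CleanupRegistry code. The names CleanupItem, isCleanupSuccess, and runCleanup are hypothetical, and the sketch assumes each item's delete call resolves with an HTTP status code:

```typescript
// Hypothetical shape of one cleanup item; not the suite's actual type.
type CleanupItem = {
  label: string;
  // Resolves with the HTTP status of the delete call; may reject on network errors.
  remove: () => Promise<number>;
};

// 2xx means the resource was deleted; 404 means it was already gone.
function isCleanupSuccess(status: number): boolean {
  return (status >= 200 && status < 300) || status === 404;
}

// Runs every item, reports failures through `log`, and never throws,
// so one bad delete cannot stop the remaining items.
async function runCleanup(
  items: CleanupItem[],
  log: (message: string) => void = (message) => console.error(message),
): Promise<string[]> {
  const failed: string[] = [];
  for (const item of items) {
    try {
      const status = await item.remove();
      if (!isCleanupSuccess(status)) {
        log(`cleanup failed for ${item.label}: HTTP ${status}`);
        failed.push(item.label);
      }
    } catch (err) {
      // Thrown network errors are logged with the same label, then skipped.
      log(`cleanup failed for ${item.label}: ${String(err)}`);
      failed.push(item.label);
    }
  }
  return failed;
}
```

Returning the failed labels is an addition for illustration; the commit itself only describes logging to stderr.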
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Date.now() can collide across workers running on the same millisecond
boundary. The worker-aware helper that the rest of the suite uses
side-steps that without changing the test's intent.
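
For illustration only, one way a worker-aware name can avoid the collision. The suite's real helper is not shown in this log; uniqueName is a hypothetical stand-in, and the sketch assumes the caller passes a worker index such as Playwright's testInfo.workerIndex:

```typescript
// Per-process counter. Each Playwright worker is its own process, so the
// counter alone cannot distinguish workers; the worker index does that.
let sequence = 0;

// Hypothetical helper: combines worker index, timestamp, and a counter.
function uniqueName(
  prefix: string,
  workerIndex: number,
  now: number = Date.now(),
): string {
  sequence += 1;
  return `${prefix}-w${workerIndex}-${now}-${sequence}`;
}
```

Two workers calling this in the same millisecond still produce different names because the worker index differs, and two calls within one worker differ by the counter.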
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Array.reverse mutates in place — a defensive double-run() would have
re-reversed the items. Iterate over a copy.
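
A small sketch of the difference, assuming the registry walks its items in reverse registration order. inTeardownOrder is a hypothetical name, not the registry's own method:

```typescript
// Returns a reversed copy; the caller's array keeps its original order,
// so calling this twice yields the same teardown order both times.
function inTeardownOrder<T>(items: readonly T[]): T[] {
  return [...items].reverse();
}
```

Calling items.reverse() directly would flip the stored array on every run, so a second run would tear things down in creation order.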
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cancels once to assert the modal can be dismissed without side
effects, then confirms to flip the user to inactive, then reactivates
to assert that direction remains one-click.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lifts loginAsUserToken + pageWithUserToken out of members.spec.ts into
fixtures/role-page.ts (third file that needs them). Adds shadowing
coverage: viewer member sees no New-app / Add-domain / Settings / Save
/ +Add-route, editor sees Save but no Delete header, and CodeMirror
renders contenteditable=false for viewers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wipes e2e-* apps and e2e* admin users before the suite starts so that
state left behind by a crashed run doesn't accumulate across runs (45
rows observed on 2026-05-28). Per-row try/catch keeps it best-effort; a
sweep failure never blocks the suite.
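
A rough sketch of the sweep shape, under the assumption that rows are listed first and then deleted one at a time. Row, selectStale, and sweep are hypothetical names, and the delete call is injected because the real API client is not shown here:

```typescript
// Hypothetical row shape shared by apps and admin users in this sketch.
type Row = { id: string; name: string };

// Picks only rows created by the suite, identified by their name prefix.
function selectStale(rows: Row[], prefix: string): Row[] {
  return rows.filter((row) => row.name.startsWith(prefix));
}

// Deletes matching rows one by one; a failed delete is swallowed so the
// sweep stays best-effort and never rejects.
async function sweep(
  rows: Row[],
  prefix: string,
  remove: (row: Row) => Promise<void>,
): Promise<number> {
  let removed = 0;
  for (const row of selectStale(rows, prefix)) {
    try {
      await remove(row);
      removed += 1;
    } catch {
      // Ignored on purpose: one bad row must not block the suite.
    }
  }
  return removed;
}
```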
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two scenarios that span the dashboard UI and the data/control plane
end-to-end:
- App + domain claim + script + route all created via the dashboard,
  then the script is invoked through the public URL with the
  matching Host header. Verifies the dashboard actions actually
  reach the orchestrator's route trie.
- API key minted via the dashboard, then used as a bearer token
  against /api/v1/admin/* (the CLI surface). Confirms the scope is
  enforced (script:read passes /scripts, 403s /admins) and that
  revoking via the dashboard immediately invalidates the token.
Also: the B7 copy-token test selected the mint-form Name input via
getByLabel('Name'), which became ambiguous once the integration
test created an app and the Binding dropdown was no longer empty.
Switched both B7 mint flows to placeholder-based selectors.
Suite: 57/57 passing in ~18s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three issues found while running the full B1–B8 suite together:
- The B1 logout test was driving the shared admin storageState
  token, invalidating it for every subsequent test. Switched it to
  a fresh login so its session is disposable.
- Bumped navigationTimeout to 30s and capped local workers at 4 to
  cope with the Vite dev server's first-compile cost under
  parallel load. Local runs also get one retry to absorb intermittent
  warmup flakiness.
- Cleared a few lint warnings (unused appId / _adminPage vars) and
  added belt-and-braces gitignore entries for Playwright artifacts
  written to the repo root when the CLI is invoked from there by
  accident.
Suite now: 55/55 passing in ~21s.
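
The timeout, worker, and retry settings from the second item could look roughly like this. It is a plain-object sketch rather than the actual playwright.config.ts; settingsFor and SuiteSettings are hypothetical, and leaving workers and retries unset for non-local runs is an assumption, since the commit only describes the local values:

```typescript
// Mirrors the Playwright config fields named in the commit. In a real
// config file this object would be passed to defineConfig.
type SuiteSettings = {
  workers?: number;
  retries?: number;
  use: { navigationTimeout: number };
};

function settingsFor(isLocal: boolean): SuiteSettings {
  return {
    // 30s absorbs the Vite dev server's first-compile cost.
    use: { navigationTimeout: 30_000 },
    // Assumption: non-local runs keep Playwright's defaults.
    ...(isLocal ? { workers: 4, retries: 1 } : {}),
  };
}
```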
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five tests covering platform-wide guarantees: expired-token
redirect, HttpOnly session cookie, bootstrap password not leaked
into the DOM after login, missing-app slug fails gracefully, and
an XSS-sink probe across the main authed routes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six tests covering /admin/profile: mint instance-wide key with the
reveal/ack flow, the app-binding mutual-exclusion guard (instance
scopes auto-disabled), revoke via the ConfirmModal, the
?denied=users banner, plus adversarial cases (empty-name button
disabled, copy-token writes the full token to the clipboard).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four tests covering the Members tab: invite + remove (action-menu +
phrase modal), role change, the non-app-admin viewer who never sees
the Members tab at all (cross-context via a second admin login),
and an adversarial that the role dropdown only exposes the
documented set of values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seven tests covering the Routing tab inside the script editor: add
+ list + remove (handling the window.confirm dialog), match-preview
round trip, path-kind mismatch warning, unclaimed-host warning,
duplicate-route 409, plus reserved-prefix rejection and a path-XSS
adversarial that checks no script tag escapes into the route list.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seven tests covering script creation via the Scripts tab, the source
editor (CodeMirror typing + save + reload), Format-button error
surfaces for both Rhai and the test-invoke JSON body, the test-invoke
happy path, settings input validation, and an infinite-loop adversarial
that asserts the sandbox timeout reports cleanly and the editor stays
interactive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seven tests covering app CRUD via the dashboard: create with
slug auto-derive, settings rename, delete with phrase-confirmation
modal, historical-slug takeover via the create form, plus adversarial
inputs (slug normalization, XSS in name/description, oversized name).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eight tests covering the login form, layout-level redirects, logout,
and the obvious adversarial inputs (XSS in username, empty submit,
password field type, leaked tokens). All targeted at /admin/login and
the bounce-back behaviors implemented in +layout.svelte.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Milestone A of the frontend test plan. Sets up the test rig — config,
globalSetup that probes the backend and seeds an admin session into
storageState, lightweight fixtures, and a 3-test smoke spec — without
yet covering any user journeys (those land in Milestone B).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>