
MechaCat02 e42d8a92a1 feat(e2e): Playwright suite — 134 tests across 9 spec areas + UA matrix

Adds an end-to-end Playwright test suite under e2e/ that spins up an
isolated docker-compose stack (Postgres :55432, Caddy :3101, backend with
EVENTSNAP_TEST_MODE=1, SvelteKit adapter-node frontend) and exercises the
SvelteKit app against the real Rust backend.

Phase 1 — happy paths covering every documented USER_JOURNEYS.md flow:
  01-auth/      join, recover, admin login, leave event, PIN lockout
  02-upload/    gallery picker (API path), rate-limit + admin toggle
  03-feed/      like/comment SSE, filters, SSE reconnect on visibility
  04-host/      event lock API, ban/unban, promote
  05-admin/     config validation, foundational authz guards, stats
  06-export/    /export status + download stub
  __smoke/      cross-UA happy-path (runs on every UA project)

Phase 2 — adversarial + browser chaos:
  07-adversarial/  XSS payloads (6 × display name path), SQLi shapes,
                   length / encoding / RTL override / NUL byte;
                   file-upload boundaries (ELF body claimed as JPEG,
                   oversize vs max_image_size_mb, zero-byte, NUL
                   filename, path-traversal, SVG-with-script);
                   JWT alg:none, signature/payload tamper, expired
                   session, PIN brute-force (serial + parallel),
                   admin password brute-force; deep authz (cross-user
                   delete, banned user across like/comment/feed-read,
                   host→admin escalation); small-scale DDoS (20× /join,
                   10MB comment body, 10 concurrent SSE).
  08-browser-chaos/ localStorage / sessionStorage / cookie purge,
                    IndexedDB drop mid-session, offline → reconnect,
                    slow-3G, 503 flakes, 429 with no retry storm,
                    multi-tab same/different user, no-JS, hostile CSS,
                    clock skew ±1h / -2d, localStorage quota exhausted.

Phase 3 — mobile gestures (runs only on chromium-mobile / Pixel 7):
  09-mobile/    touch-target ≥44px audit, env(safe-area-inset-bottom)
                structural check, long-press (FeedListCard → ContextSheet,
                quick-tap negation, click-suppression), double-tap
                (feed card like + lightbox heart-burst, via synthetic
                pointer events to bypass the first-tap-fires-click trap),
                viewport reflow (portrait/landscape/narrow/phablet),
                plus fixme stubs documenting planned gestures (swipe
                lightbox L/R, swipe-down dismiss, pull-to-refresh,
                long-press-comment).

Cross-UA matrix (chromium-engine projects run @smoke only):
  chromium-pixel7, chromium-galaxy-s22, samsung-internet (Samsung UA
  emulation on Galaxy viewport), edge-android, plus webkit-iphone,
  chrome-ios, firefox-android, firefox-desktop — the latter four need
  libavif16 on the host (Playwright dep) but the configs are in place.

Infrastructure:
  - fixtures/test.ts central test.extend (api, db, adminToken, guest,
    host, signIn). Per-test DB truncate via the dev-only POST
    /admin/__truncate route, gated by EVENTSNAP_TEST_MODE=1.
  - helpers/sse-listener.ts, helpers/upload-client.ts (Node-side
    multipart for adversarial file-upload tests + JPEG/PNG/ELF magic
    constants), helpers/touch.ts (longPress / doubleTap / swipe /
    inlineStyle / computedStyle).
  - 10 page objects covering every route + UploadSheet/Lightbox.
  - global-setup waits for /health, logs in admin, disables every
    rate-limit and quota toggle.
  - .github/workflows/e2e.yml: PR check runs chromium-desktop + the
    smoke matrix in parallel, uploads playwright-report/ and traces on
    failure.

Findings the suite surfaces as live `[finding]` warnings (not silenced):
  1. /admin/login has no rate-limit or lockout (bcrypt cost only).
  2. PIN-attempt counter races under parallel /recover requests.
  3. Zero-byte uploads pass /api/v1/upload.
  4. SVG-with-script can pass the magic-byte check (consider CSP +
     X-Content-Type-Options on /media/*).

Stack-internal docs live in e2e/README.md (UA tier table, Samsung
Internet escalation tiers A/B/C, debugging tips, roadmap).

Final tally: 134 passed / 0 failed / 9 skipped (test.fixme stubs for
not-yet-shipped gestures and one UI-upload-flow investigation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-16 19:37:11 +02:00


EventSnap E2E Suite

Playwright-driven end-to-end tests for the EventSnap stack. The suite spins up an isolated docker-compose stack on ports :3101 (Caddy → frontend + backend) and :55432 (Postgres), and exercises the SvelteKit frontend against a real Rust backend with rate limits and quotas disabled.

Phase 1, Phase 2, and the mobile-gesture part of Phase 3 have landed:

Phase 1 — happy-path coverage of every documented user journey, plus a smoke matrix across nine browser/UA profiles to catch engine-level divergences.
Phase 2 — adversarial inputs (XSS, SQL-injection, JWT forgery, MIME spoofing, oversize, brute-force) and browser chaos (storage purge, offline/slow-3G, multi-tab, clock skew, no-JS, quota exhaustion).
Phase 3 (gestures only) — touch-target audit, safe-area structural check, long-press → ContextSheet, double-tap → like, viewport reflow, plus test.fixme stubs for planned gestures (lightbox swipe, swipe-down dismiss, pull-to-refresh).

Phase 3 real-device compat (Android emulator + Samsung Internet via connectOverCDP, BrowserStack), visual regression, and a11y audits are sketched in the Roadmap at the bottom.

Quickstart

cd e2e
npm install
npm run install:browsers      # one-time: ~500 MB across chromium/firefox/webkit

# 1. Boot the test stack (rebuilds backend + frontend Docker images)
npm run stack:up

# 2. Wait ~20s for migrations + warmup, then run tests
npm run test:e2e              # full Phase 1 suite on chromium-desktop
npm run test:e2e:smoke        # cross-UA smoke matrix (~9 projects × 1 test)
npm run test:e2e:ui           # interactive Playwright UI mode

# 3. After: tear the stack down (deletes volumes)
npm run stack:down

The CI workflow at .github/workflows/e2e.yml runs both jobs (chromium-desktop and the smoke matrix) on every PR.

What's tested

Every spec covers a journey from docs/USER_JOURNEYS.md or a security/chaos scenario. One folder per area:

Folder	Phase	Journeys / Topic	Tests	Notes
`specs/01-auth/`	1	§1, §2, §3, §11, §15	13	Join, recover, PIN lockout, admin login, leave event.
`specs/02-upload/`	1	§5, §6, §18	5	Gallery picker, multi-file, rate-limit, admin toggle.
`specs/03-feed/`	1	§7, §8, §17	5	Like/comment SSE, filter chips, SSE reconnect.
`specs/04-host/`	1	§9	5	Event lock, ban/unban, role change.
`specs/05-admin/`	1	§11, §16	11	Config validation, foundational auth guards, stats.
`specs/06-export/`	1	§12	3	Status, release, download stub.
`specs/__smoke/`	1	(matrix)	1 × 9 UAs	`@smoke`-tagged happy-path on every UA project.
`specs/07-adversarial/`	2	Input attacks, file upload boundaries, JWT forgery, brute-force, deep authorization, small DDoS	~40	See breakdown below.
`specs/08-browser-chaos/`	2	Storage purge, IndexedDB, offline/slow-3G, multi-tab, no-JS, clock skew, quota	~20	See breakdown below.
`specs/09-mobile/`	3	Touch-target audit, safe-area, long-press, double-tap, viewport reflow, fixme stubs	23	Runs only on `chromium-mobile` (Pixel 7 viewport). See below.

Phase 2 — adversarial (`specs/07-adversarial/`)

xss-injection.spec.ts — 13 tests. Six XSS payloads × display-name path + four SQLi patterns + length/encoding edge cases (NUL byte, RTL override, caption overflow). Asserts window.__xssFired never gets set and no dialog event fires.
ui-rendering.spec.ts — 2 tests. Belt-and-braces: even when a script payload sits in localStorage as the user's display name, rendering through /account keeps it as text.
file-upload-attacks.spec.ts — 9 tests. ELF body claimed as JPEG, oversize image vs max_image_size_mb, zero-byte, missing file field, path-traversal filename, NUL filename, application/* declared category bypass, SVG-with-script.
auth-tampering.spec.ts — 8 tests. alg:none forging admin role, signature tamper, payload tamper with original signature, logged-out session reuse, Authorization header without the Bearer prefix, missing Authorization header, PIN brute-force lockout, admin password brute-force (documented finding: no lockout today, bcrypt cost is the only defense).
authorization-deep.spec.ts — 6 tests. Cross-user comment delete, banned user across like/comment/feed-read, host→admin escalation attempts.
ddos.spec.ts — 4 small-scale abuse tests. 20 parallel /join, 10 MB comment body, 10 concurrent SSE streams, malformed JSON.
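
The alg:none and signature-tamper cases in auth-tampering.spec.ts can be pictured with a small token-forging sketch. This is an illustration only: the function names and the claim names (`sub`, `role`) are assumptions, not the suite's actual helpers or the backend's confirmed claim set.

```typescript
import { Buffer } from "node:buffer";

/** Base64url-encode a JSON value, as JWT header and payload segments are. */
function b64urlJson(value: unknown): string {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

/**
 * Build an unsigned token whose header declares `alg: "none"`.
 * Hypothetical helper: a correct backend must reject this token.
 */
function forgeAlgNoneToken(claims: Record<string, unknown>): string {
  const header = b64urlJson({ alg: "none", typ: "JWT" });
  const payload = b64urlJson(claims);
  // The third segment stays empty because nothing is signed.
  return `${header}.${payload}.`;
}

/**
 * Replace the first signature character so the signature no longer matches.
 * (An empty signature simply becomes "A".)
 */
function tamperSignature(token: string): string {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("expected header.payload.signature");
  const signature = parts[2] ?? "";
  const flipped = (signature.charAt(0) === "A" ? "B" : "A") + signature.slice(1);
  return `${parts[0]}.${parts[1]}.${flipped}`;
}
```

A spec would send either token in the Authorization header of an admin-only request and assert that the backend answers with an authentication error rather than honoring the forged role.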

Phase 2 — browser chaos (`specs/08-browser-chaos/`)

storage-purge.spec.ts — 5 tests. localStorage.clear() mid-session, cookies cleared (JWT in localStorage still works), sessionStorage cleared, admin force-relogin, PIN intentionally survives clearAuth.
indexeddb.spec.ts — 2 tests. Drop all IDB databases mid-session; stub IDB to undefined before navigation.
offline-network.spec.ts — 4 tests. setOffline(true) → reconnect, slow-3G via page.route delay, intermittent 503s, 429 from server (no infinite retry storm).
multi-tab.spec.ts — 3 tests. Same user two tabs, two users two contexts (storage isolated), logout in tab A doesn't sync to tab B (documented gap).
environment.spec.ts — 5 tests. JS disabled, localStorage quota exhausted, hostile CSS hiding nav, clock skew ±1h / -2d.
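
The "slow-3G via page.route delay" idea in offline-network.spec.ts can be sketched as follows. The `PageLike` and `RouteLike` interfaces are local stand-ins for the parts of Playwright's `Page` and `Route` that the sketch touches, and `throttleRoutes` is a hypothetical name, so the block runs without Playwright installed.

```typescript
// Local stand-ins for the subset of Playwright's Route and Page used here.
interface RouteLike {
  continue(): Promise<void>;
}
interface PageLike {
  route(
    urlPattern: string,
    handler: (route: RouteLike) => Promise<void>,
  ): Promise<void>;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Hold every request matching `urlPattern` for `delayMs`, then let it
 * proceed unchanged. This approximates a slow link by adding latency only;
 * it does not model bandwidth.
 */
async function throttleRoutes(
  page: PageLike,
  urlPattern: string,
  delayMs: number,
): Promise<void> {
  await page.route(urlPattern, async (route) => {
    await sleep(delayMs);
    await route.continue();
  });
}
```

In a spec this might be called as `await throttleRoutes(page, "**/api/v1/**", 800)` before navigation; the glob and the 800 ms figure are illustrative values, not the suite's settings.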

Pending tests covering features that need a Node-side multipart upload helper are marked test.fixme and will activate when that helper lands.

Browser & UA matrix

Project	Engine	UA / Device	Why
`chromium-desktop`	Chromium	Desktop Chrome	Baseline. Full suite runs here.
`chromium-pixel7`	Chromium	Pixel 7 device descriptor	Chrome Android.
`chromium-galaxy-s22`	Chromium	Galaxy viewport + Samsung phone UA	Chrome on Samsung hardware.
`samsung-internet`	Chromium	Galaxy viewport + SamsungBrowser UA	Tier-A Samsung Internet baseline.
`edge-android`	Chromium	Pixel viewport + EdgA UA	Edge Mobile (Blink-based).
`chrome-ios`	Chromium	iPhone viewport + CriOS UA	Chrome on iOS UA emulation (the real browser runs on WebKit; only the UA differs here).
`webkit-iphone`	WebKit	iPhone 14 Pro	Real iOS Safari engine.
`firefox-android`	Firefox	Pixel viewport + Firefox Android UA	Gecko engine.
`firefox-desktop`	Firefox	Desktop Firefox	FF-specific quirks.

Only the @smoke happy-path runs across all projects (controlled by grep in playwright.config.ts). The full Phase 1 suite is chromium-desktop-only by default to keep CI under 15 min.
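
One way to express that split is a per-project `grep`, sketched below as plain data. The project names come from the table above; the `ProjectSketch` type, the `testIgnore` pattern, and the overall layout are assumptions about how playwright.config.ts might be organized, not a copy of it. Device descriptors and user-agent strings are omitted.

```typescript
// Only the fields this sketch needs; Playwright's project config has more.
interface ProjectSketch {
  name: string;
  grep?: RegExp; // run only tests whose title matches
  testIgnore?: RegExp; // skip spec files whose path matches
}

const SMOKE = /@smoke/;

// Every non-baseline project is restricted to @smoke-tagged tests.
const smokeOnlyNames = [
  "chromium-pixel7",
  "chromium-galaxy-s22",
  "samsung-internet",
  "edge-android",
  "chrome-ios",
  "webkit-iphone",
  "firefox-android",
  "firefox-desktop",
];

const projects: ProjectSketch[] = [
  // Baseline: full suite, minus the mobile-only folder.
  { name: "chromium-desktop", testIgnore: /09-mobile/ },
  ...smokeOnlyNames.map((name) => ({ name, grep: SMOKE })),
];
```

Under this layout, adding a new UA profile to the matrix costs one array entry, and the baseline project is the only one without a `grep`.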

Samsung Internet — three escalation tiers

Samsung Internet ships on every Galaxy phone (~5% of mobile traffic in DE). It's Blink-based, so Tier-A catches ~90% of regressions. Real Samsung divergences (Smart Switch save-data mode, dark-mode injection, custom autoplay, in-browser ad blocking) are only reproducible at Tier B+:

Tier A (this repo, free, in CI): Playwright Chromium with the Samsung Internet user-agent + Galaxy viewport. See the samsung-internet project in playwright.config.ts.
Tier B (free, manual, future): Android Studio emulator on Linux → install Samsung Internet APK → enable --remote-debugging-port=9222 → chromium.connectOverCDP('http://localhost:9222'). Setup docs live in docs/samsung-emulator.md (to be written).
Tier C (paid, optional): BrowserStack or LambdaTest cloud devices. Real Galaxy S22/S23 hardware via Playwright's cloud integration.

Test isolation

Every test runs against a freshly truncated database:

global-setup.ts waits for /health, logs in admin, and disables every rate-limit and quota toggle via PATCH /admin/config.
The auto-fixture truncate in fixtures/test.ts calls POST /api/v1/admin/__truncate before every test.
The truncate endpoint is only registered when the backend is started with EVENTSNAP_TEST_MODE=1 (see backend/src/main.rs and backend/src/handlers/test_admin.rs). Production builds return 404.
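
The health wait and the per-test truncate can be sketched as two small functions. Everything here is an assumption for illustration: the function names, the injectable `fetchImpl` parameter, the timeout values, and the idea that the truncate route expects the admin bearer token. Only the two paths (/health and POST /api/v1/admin/__truncate) come from the text above.

```typescript
// Minimal fetch shape, so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number }>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Poll GET /health until it answers 2xx or the timeout elapses. */
async function waitForHealth(
  baseUrl: string,
  fetchImpl: FetchLike,
  timeoutMs = 30_000,
  intervalMs = 500,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetchImpl(`${baseUrl}/health`);
      if (res.ok) return;
    } catch {
      // Connection errors are expected while the stack is still booting.
    }
    await sleep(intervalMs);
  }
  throw new Error(`backend not healthy after ${timeoutMs} ms`);
}

/** Reset the database through the test-mode-only truncate route. */
async function truncateDb(
  baseUrl: string,
  adminToken: string,
  fetchImpl: FetchLike,
): Promise<void> {
  const res = await fetchImpl(`${baseUrl}/api/v1/admin/__truncate`, {
    method: "POST",
    headers: { Authorization: `Bearer ${adminToken}` },
  });
  // A 404 here would mean the backend runs without EVENTSNAP_TEST_MODE=1.
  if (!res.ok) throw new Error(`truncate failed with HTTP ${res.status}`);
}
```

With the global `fetch` passed as `fetchImpl` and `http://localhost:3101` as the base URL, `waitForHealth` would run once in global setup and `truncateDb` before each test.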

Single-worker by design (workers: 1 in the config). Per-worker isolated DBs are a Phase-2+ change.

Architecture

e2e/
├── docker-compose.test.yml   # Isolated test stack: db :55432, caddy :3101
├── Caddyfile.test            # Proxies /api/* /media/* /health to backend
├── playwright.config.ts      # UA matrix + smoke grep
├── global-setup.ts           # admin login, rate-limit disable
├── global-teardown.ts        # (no-op; use `npm run stack:down`)
├── fixtures/
│   ├── api-client.ts         # Typed wrapper over /api/v1/*
│   ├── db.ts                 # Direct Postgres escape hatch (locked-PIN, etc.)
│   ├── test.ts               # Central test.extend (guest, host, signIn fixtures)
│   └── media/                # sample.jpg, sample.mp4, not-an-image.jpg
├── helpers/
│   ├── sse-listener.ts       # Async SSE iterator with waitForEvent()
│   ├── storage-helpers.ts    # localStorage/sessionStorage helpers
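│   ├── fake-media.ts         # Camera permissions (Chromium only)
│   ├── upload-client.ts      # Node-side multipart upload + JPEG/PNG/ELF magic constants
│   └── touch.ts              # longPress / doubleTap / swipe / inlineStyle / computedStyle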
├── page-objects/
│   ├── join-page.ts          # /join
│   ├── recover-page.ts       # /recover
│   ├── admin-login-page.ts   # /admin/login
│   ├── feed-page.ts          # /feed + bottom nav
│   ├── upload-sheet.ts       # UploadSheet.svelte + /upload
│   ├── lightbox.ts           # LightboxModal.svelte
│   ├── account-page.ts       # /account
│   ├── host-dashboard.ts     # /host
│   ├── admin-dashboard.ts    # /admin
│   └── export-page.ts        # /export
└── specs/
    ├── __smoke/              # @smoke cross-UA matrix (1 spec)
    ├── 01-auth/
    ├── 02-upload/
    ├── 03-feed/
    ├── 04-host/
    ├── 05-admin/
    ├── 06-export/
    ├── 07-adversarial/
    ├── 08-browser-chaos/
    └── 09-mobile/

Debugging a failure

npm run test:e2e:ui — interactive UI with time-travel and selector probe.
npm run test:e2e:headed — watch the browser run live.
npm run test:e2e:debug — Playwright inspector with breakpoints.
npm run stack:logs — tail backend + Postgres logs during a failure.
playwright-report/index.html — opens the HTML report (auto-generated on every run).
Trace files (test-results/**/trace.zip) can be dragged and dropped into https://trace.playwright.dev.

Conventions

One assertion per expect. Bundling multiple expects in one statement loses the line-level failure context.
Wait on data, not time. Use expect.poll for DB checks; never waitForTimeout in production specs.
@smoke tag on each suite's happiest path so the matrix run stays under 2 min.
test.fixme for features that need infrastructure not yet built (Node-side multipart upload helper, real video fixtures, etc.). Fixme tests don't fail the suite but show up in the report.
Page objects own selectors. Specs never use raw locators.
German text in assertions is fine — it's not going to change frequently. When it does, the page object is the only file to update.

Roadmap

Phase 2 — Adversarial & browser chaos ✅ landed

See the What's tested table above and the per-file breakdown. Known findings surfaced (documented in tests, not silent failures):

/admin/login has no rate-limit or lockout — bcrypt cost is the only defense.
localStorage 'storage' event is not listened for, so logout in tab A doesn't synchronously sign out tab B (the next 401 from any API call clears it).
SVG uploads currently pass the magic-byte check (depends on infer's detection coverage). Consider adding X-Content-Type-Options: nosniff + CSP on /media/* if SVGs are ever expected as user content.
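
To make the magic-byte discussion concrete, here is a minimal prefix sniffer. It is not the backend's detection logic and says nothing about how the `infer` crate behaves; the function and constant names are hypothetical. It only illustrates that binary formats carry a fixed leading signature while an SVG is plain XML text with none.

```typescript
import { Buffer } from "node:buffer";

// Well-known leading signatures.
const JPEG_MAGIC = Buffer.from([0xff, 0xd8, 0xff]);
const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const ELF_MAGIC = Buffer.from([0x7f, 0x45, 0x4c, 0x46]); // 0x7F "ELF"

type Sniffed = "jpeg" | "png" | "elf" | "unknown";

/** Classify a body by its leading bytes only, ignoring any declared MIME type. */
function sniff(body: Buffer): Sniffed {
  const startsWith = (magic: Buffer): boolean =>
    body.subarray(0, magic.length).equals(magic);
  if (startsWith(JPEG_MAGIC)) return "jpeg";
  if (startsWith(PNG_MAGIC)) return "png";
  if (startsWith(ELF_MAGIC)) return "elf";
  return "unknown";
}

// An ELF body uploaded under a .jpg name is still recognizably ELF.
const elfClaimedAsJpeg = Buffer.concat([ELF_MAGIC, Buffer.alloc(16)]);

// An SVG with an embedded script has no binary signature to match.
const svgWithScript = Buffer.from(
  '<svg xmlns="http://www.w3.org/2000/svg"><script>alert(1)</script></svg>',
  "utf8",
);
```

Here `sniff(elfClaimedAsJpeg)` yields `"elf"` and `sniff(svgWithScript)` yields `"unknown"`, which is why response headers on /media/* matter as a second line of defense.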

Phase 3 — Mobile gestures (`specs/09-mobile/`) ✅ landed

Runs only on the chromium-mobile project (Pixel 7 device descriptor with hasTouch and isMobile). The chromium-desktop project explicitly ignores this folder via testIgnore in playwright.config.ts.

touch-targets.spec.ts — 4 tests. Audits ≥ 44×44 px on bottom nav, FAB, join submit, admin-login submit, PIN-modal buttons. Uses expect.soft so a single failure surfaces the actual bounding-box dimensions instead of stopping the suite.
safe-area.spec.ts — 4 tests. Asserts env(safe-area-inset-bottom) is present in the inline style of every bottom-anchored UI element (bottom nav, UploadSheet, ContextSheet), and that the nav stays flush with the viewport bottom on a no-notch emulated device.
gestures-longpress.spec.ts — 3 tests. A 600 ms hold on a FeedListCard opens the ContextSheet; a 200 ms tap does not; the click-suppression logic prevents the lightbox from also opening at pointer-up. Driven via page.mouse.down/up because the longpress action listens for pointer events (mouse/touch/pen unified).
gestures-doubletap.spec.ts — 2 tests. Double-tap on a feed card image button records a like; double-tap inside the lightbox triggers the heart-burst animation and records a like. Assertions read the like count back via /api/v1/feed so they don't couple to specific badge markup.
viewport-reflow.spec.ts — 5 tests. Portrait, landscape, narrow (320×568), phablet (480×1024) — each asserts the bottom nav is visible, the FAB stays roughly centered, and there's no horizontal overflow on <html>. Plus a rotation test that confirms auth survives a viewport resize.
planned-gestures.spec.ts — 5 test.fixme stubs documenting the contracts for gestures from journey §17 that aren't shipped yet (lightbox swipe L/R, swipe-down to dismiss UploadSheet, pull-to-refresh, long-press on a comment). Flip test.fixme to test when wiring each gesture.
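
The touch-target audit reduces to a size check over bounding boxes. The sketch below collects every violation instead of stopping at the first, in the spirit of the expect.soft usage described for touch-targets.spec.ts. The function name, the message format, and the sample element names are assumptions.

```typescript
// Same shape as the bounding box a browser automation tool reports.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Return one message per element that is missing or smaller than
 * `minPx` in either dimension. An empty array means the audit passes.
 */
function touchTargetViolations(
  boxes: Record<string, Box | null>,
  minPx = 44,
): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(boxes)) {
    const box = boxes[name];
    if (!box) {
      problems.push(`${name}: not rendered`);
      continue;
    }
    if (box.width < minPx || box.height < minPx) {
      problems.push(`${name}: ${box.width}x${box.height} px (min ${minPx})`);
    }
  }
  return problems;
}
```

Reporting the measured dimensions in the message keeps the failure actionable: a reader sees how far off the element is without rerunning the test.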

Driving gestures: the `helpers/touch.ts` module

longPress(page, locator, durationMs) — holds the pointer down for the duration. Default 600 ms beats the action's 500 ms threshold.
doubleTap(page, locator) — two mouse.down/up pairs within the doubletap action's 300 ms window.
swipe(page, from, to, steps) — gradual mouse-driven move (used by the fixme stubs once swipe gestures land).
inlineStyle(locator) / computedStyle(locator, prop) — read raw style attributes (where env(...) strings live) and computed values.
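
A rough sketch of how longPress and doubleTap could be built on mouse down/up pairs follows. The `PageLike` and `LocatorLike` interfaces are local stand-ins for the subset of Playwright's `Page` and `Locator` used here, and the `gapMs` parameter and `centerOf` helper are assumptions; the real helpers/touch.ts may differ.

```typescript
interface Box { x: number; y: number; width: number; height: number }
interface MouseLike {
  move(x: number, y: number): Promise<void>;
  down(): Promise<void>;
  up(): Promise<void>;
}
interface PageLike { mouse: MouseLike }
interface LocatorLike { boundingBox(): Promise<Box | null> }

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Center point of the element, or an error if it has no layout box. */
async function centerOf(locator: LocatorLike): Promise<{ x: number; y: number }> {
  const box = await locator.boundingBox();
  if (!box) throw new Error("element is not visible, cannot gesture on it");
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

/** Press, hold for `durationMs`, release. 600 ms clears a 500 ms threshold. */
async function longPress(page: PageLike, locator: LocatorLike, durationMs = 600): Promise<void> {
  const { x, y } = await centerOf(locator);
  await page.mouse.move(x, y);
  await page.mouse.down();
  await sleep(durationMs);
  await page.mouse.up();
}

/** Two down/up pairs separated by `gapMs`, which must stay under 300 ms. */
async function doubleTap(page: PageLike, locator: LocatorLike, gapMs = 100): Promise<void> {
  const { x, y } = await centerOf(locator);
  await page.mouse.move(x, y);
  await page.mouse.down();
  await page.mouse.up();
  await sleep(gapMs);
  await page.mouse.down();
  await page.mouse.up();
}
```

The hold in longPress is a deliberate timed wait, because the gesture itself is defined by elapsed time; assertions after the gesture should still wait on observable state.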

Long-press own/other post, swipe lightbox L/R, swipe-down dismiss, pull-to-refresh, double-tap like.
Safe-area inset visual diff on iPhone notch.
Touch-target ≥ 44 px audit.
Tier B Samsung Internet via connectOverCDP on Android Studio emulator.
Tier C BrowserStack integration (paid, optional).
@axe-core/playwright accessibility audits.
Visual regression with screenshot diffs.

Out of scope (handed to other tools)

Load testing → k6 / Vegeta.
API contract testing → backend cargo test integration tests.
Static asset auditing → Lighthouse CI.
