feat(e2e): Playwright suite — 134 tests across 9 spec areas + UA matrix

Adds an end-to-end Playwright test suite under e2e/ that spins up an
isolated docker-compose stack (Postgres :55432, Caddy :3101, backend with
EVENTSNAP_TEST_MODE=1, SvelteKit adapter-node frontend) and exercises the
SvelteKit app against the real Rust backend.

Phase 1: happy paths covering every documented USER_JOURNEYS.md flow:
  01-auth/    join, recover, admin login, leave event, PIN lockout
  02-upload/  gallery picker (API path), rate-limit + admin toggle
  03-feed/    like/comment SSE, filters, SSE reconnect on visibility
  04-host/    event lock API, ban/unban, promote
  05-admin/   config validation, foundational authz guards, stats
  06-export/  /export status + download stub
  __smoke/    cross-UA happy-path (runs on every UA project)

Phase 2: adversarial + browser chaos:
  07-adversarial/    XSS payloads (6 × display name path), SQLi shapes,
                     length / encoding / RTL override / NUL byte;
                     file-upload boundaries (ELF body claimed as JPEG,
                     oversize vs max_image_size_mb, zero-byte, NUL filename,
                     path-traversal, SVG-with-script); JWT alg:none,
                     signature/payload tamper, expired session, PIN
                     brute-force (serial + parallel), admin password
                     brute-force; deep authz (cross-user delete, banned user
                     across like/comment/feed-read, host→admin escalation);
                     small-scale DDoS (20× /join, 10MB comment body,
                     10 concurrent SSE).
  08-browser-chaos/  localStorage / sessionStorage / cookie purge, IndexedDB
                     drop mid-session, offline → reconnect, slow-3G, 503
                     flakes, 429 with no retry storm, multi-tab
                     same/different user, no-JS, hostile CSS, clock skew
                     ±1h / -2d, localStorage quota exhausted.

Phase 3: mobile gestures (runs only on chromium-mobile / Pixel 7):
  09-mobile/  touch-target ≥44px audit, env(safe-area-inset-bottom)
              structural check, long-press (FeedListCard → ContextSheet,
              quick-tap negation, click-suppression), double-tap (feed card
              like + lightbox heart-burst, via synthetic pointer events to
              bypass the first-tap-fires-click trap), viewport reflow
              (portrait/landscape/narrow/phablet), plus fixme stubs
              documenting planned gestures (swipe lightbox L/R, swipe-down
              dismiss, pull-to-refresh, long-press-comment).

Cross-UA matrix (chromium-engine projects run @smoke only):
chromium-pixel7, chromium-galaxy-s22, samsung-internet (Samsung UA
emulation on Galaxy viewport), edge-android, plus webkit-iphone,
chrome-ios, firefox-android, firefox-desktop. The latter four need
libavif16 on the host (Playwright dep), but the configs are in place.

Infrastructure:
- fixtures/test.ts central test.extend (api, db, adminToken, guest, host,
  signIn). Per-test DB truncate via the dev-only POST /admin/__truncate
  route, gated by EVENTSNAP_TEST_MODE=1.
- helpers/sse-listener.ts, helpers/upload-client.ts (Node-side multipart
  for adversarial file-upload tests + JPEG/PNG/ELF magic constants),
  helpers/touch.ts (longPress / doubleTap / swipe / inlineStyle /
  computedStyle).
- 10 page objects covering every route + UploadSheet/Lightbox.
- global-setup waits for /health, logs in admin, disables every rate-limit
  and quota toggle.
- .github/workflows/e2e.yml: PR check runs chromium-desktop + the smoke
  matrix in parallel, uploads playwright-report/ and traces on failure.

Findings the suite surfaces as live `[finding]` warnings (not silenced):
1. /admin/login has no rate-limit or lockout (bcrypt cost only).
2. PIN-attempt counter races under parallel /recover requests.
3. Zero-byte uploads pass /api/v1/upload.
4. SVG-with-script can pass the magic-byte check (consider CSP +
   X-Content-Type-Options on /media/*).

Stack-internal docs live in e2e/README.md (UA tier table, Samsung Internet
escalation tiers A/B/C, debugging tips, roadmap).

Final tally: 134 passed / 0 failed / 9 skipped (test.fixme stubs for
not-yet-shipped gestures and one UI-upload-flow investigation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 19:02:29 +02:00
parent 1cdab21514
commit e42d8a92a1
64 changed files with 4174 additions and 0 deletions
--- a/e2e/README.md
+++ b/e2e/README.md
@@ -0,0 +1,287 @@
+# EventSnap E2E Suite
+
+Playwright-driven end-to-end tests for the EventSnap stack. The suite spins
+up an isolated docker-compose stack on ports `:3101` (Caddy → frontend +
+backend) and `:55432` (Postgres), and exercises the SvelteKit frontend
+against a real Rust backend with rate limits and quotas disabled.
+
+**Phases 1 and 2 and the mobile-gesture part of Phase 3 have landed**:
+- **Phase 1** — happy-path coverage of every documented user journey, plus a
+  smoke matrix across nine browser/UA profiles to catch engine-level
+  divergences.
+- **Phase 2** — adversarial inputs (XSS, SQL-injection, JWT forgery, MIME
+  spoofing, oversize, brute-force) and browser chaos (storage purge,
+  offline/slow-3G, multi-tab, clock skew, no-JS, quota exhaustion).
+- **Phase 3 (gestures only)** — touch-target audit, safe-area structural
+  check, long-press → ContextSheet, double-tap → like, viewport reflow,
+  plus `test.fixme` stubs for planned gestures (lightbox swipe, swipe-down
+  dismiss, pull-to-refresh).
+
+Phase 3 real-device compat (Android emulator + Samsung Internet via
+`connectOverCDP`, BrowserStack), visual regression, and a11y audits are
+sketched in the **Roadmap** at the bottom.
+
+## Quickstart
+
+```bash
+cd e2e
+npm install
+npm run install:browsers      # one-time: ~500 MB across chromium/firefox/webkit
+
+# 1. Boot the test stack (rebuilds backend + frontend Docker images)
+npm run stack:up
+
+# 2. Wait ~20s for migrations + warmup, then run tests
+npm run test:e2e              # full Phase 1 suite on chromium-desktop
+npm run test:e2e:smoke        # cross-UA smoke matrix (~9 projects × 1 test)
+npm run test:e2e:ui           # interactive Playwright UI mode
+
+# 3. After: tear the stack down (deletes volumes)
+npm run stack:down
+```
+
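+The CI workflow at `.github/workflows/e2e.yml` runs two parallel jobs on
+every PR (the `chromium-desktop` suite and the cross-UA smoke matrix) and
+uploads `playwright-report/` and traces on failure.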
+
+## What's tested
+
+Every spec covers a journey from [`docs/USER_JOURNEYS.md`](../docs/USER_JOURNEYS.md)
+or a security/chaos scenario. One folder per area:
+
+| Folder | Phase | Journeys / Topic | Tests | Notes |
+|---|---|---|---|---|
+| `specs/01-auth/` | 1 | §1, §2, §3, §11, §15 | 13 | Join, recover, PIN lockout, admin login, leave event. |
+| `specs/02-upload/` | 1 | §5, §6, §18 | 5 | Gallery picker, multi-file, rate-limit, admin toggle. |
+| `specs/03-feed/` | 1 | §7, §8, §17 | 5 | Like/comment SSE, filter chips, SSE reconnect. |
+| `specs/04-host/` | 1 | §9 | 5 | Event lock, ban/unban, role change. |
+| `specs/05-admin/` | 1 | §11, §16 | 11 | Config validation, foundational auth guards, stats. |
+| `specs/06-export/` | 1 | §12 | 3 | Status, release, download stub. |
+| `specs/__smoke/` | 1 | (matrix) | 1 × 9 UAs | `@smoke`-tagged happy-path on every UA project. |
+| `specs/07-adversarial/` | **2** | Input attacks, file upload boundaries, JWT forgery, brute-force, deep authorization, small DDoS | ~40 | See breakdown below. |
+| `specs/08-browser-chaos/` | **2** | Storage purge, IndexedDB, offline/slow-3G, multi-tab, no-JS, clock skew, quota | ~20 | See breakdown below. |
+| `specs/09-mobile/` | **3** | Touch-target audit, safe-area, long-press, double-tap, viewport reflow, fixme stubs | 23 | Runs only on `chromium-mobile` (Pixel 7 viewport). See below. |
+
+### Phase 2 — adversarial (`specs/07-adversarial/`)
+
+- **`xss-injection.spec.ts`** — 13 tests. Six XSS payloads × display-name path
+  + four SQLi patterns + length/encoding edge cases (NUL byte, RTL override,
+  caption overflow). Asserts `window.__xssFired` never gets set and no
+  `dialog` event fires.
+- **`ui-rendering.spec.ts`** — 2 tests. Belt-and-braces: even when a script-
+  payload sits in localStorage as the user's display name, rendering through
+  `/account` keeps it as text.
+- **`file-upload-attacks.spec.ts`** — 9 tests. ELF body claimed as JPEG,
+  oversize image vs `max_image_size_mb`, zero-byte, missing file field,
+  path-traversal filename, NUL filename, `application/*` declared category
+  bypass, SVG-with-script.
+- **`auth-tampering.spec.ts`** — 8 tests. `alg:none` forging admin role,
+  signature tamper, payload tamper with original signature, logged-out
+  session reuse, header without `Bearer `, missing Authorization,
+  PIN brute-force lockout, admin password brute-force (documented finding —
+  no lockout today, bcrypt cost is the only defense).
+- **`authorization-deep.spec.ts`** — 6 tests. Cross-user comment delete,
+  banned user across like/comment/feed-read, host→admin escalation attempts.
+- **`ddos.spec.ts`** — 4 small-scale abuse tests. 20 parallel /join, 10 MB
+  comment body, 10 concurrent SSE streams, malformed JSON.
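+
+To make the `alg:none` case from `auth-tampering.spec.ts` concrete, the
+sketch below builds an unsigned token that claims an admin role. The helper
+names (`base64UrlAscii`, `forgeAlgNoneToken`) and the claim names (`sub`,
+`role`) are assumptions for illustration only; the suite's real helpers and
+the backend's real claim set may differ.
+
+```typescript
+// Hypothetical helper: builds a JWT with "alg":"none" and an empty signature.
+// Claim names are illustrative; the real backend's claims may differ.
+
+const B64URL =
+  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
+
+// Base64url-encode an ASCII string without padding (enough for this sketch).
+function base64UrlAscii(input: string): string {
+  let out = '';
+  for (let i = 0; i < input.length; i += 3) {
+    const hasB = i + 1 < input.length;
+    const hasC = i + 2 < input.length;
+    const a = input.charCodeAt(i);
+    const b = hasB ? input.charCodeAt(i + 1) : 0;
+    const c = hasC ? input.charCodeAt(i + 2) : 0;
+    if (a > 127 || b > 127 || c > 127) {
+      throw new Error('ASCII only in this sketch');
+    }
+    const triple = (a << 16) | (b << 8) | c;
+    out += B64URL.charAt((triple >> 18) & 63);
+    out += B64URL.charAt((triple >> 12) & 63);
+    if (hasB) out += B64URL.charAt((triple >> 6) & 63);
+    if (hasC) out += B64URL.charAt(triple & 63);
+  }
+  return out;
+}
+
+// Header and payload are encoded as usual; the signature segment stays empty.
+function forgeAlgNoneToken(claims: Record<string, string | number>): string {
+  const header = base64UrlAscii(JSON.stringify({ alg: 'none', typ: 'JWT' }));
+  const payload = base64UrlAscii(JSON.stringify(claims));
+  return `${header}.${payload}.`;
+}
+
+const forged = forgeAlgNoneToken({ sub: 'guest-1', role: 'admin' });
+```
+
+A spec would send `forged` as a `Bearer` token against an admin-only route
+and expect the backend to reject it rather than trust the unsigned claims.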
+
+### Phase 2 — browser chaos (`specs/08-browser-chaos/`)
+
+- **`storage-purge.spec.ts`** — 5 tests. `localStorage.clear()` mid-session,
+  cookies cleared (JWT in localStorage still works), sessionStorage cleared,
+  admin force-relogin, PIN intentionally survives clearAuth.
+- **`indexeddb.spec.ts`** — 2 tests. Drop all IDB databases mid-session;
+  stub IDB to undefined before navigation.
+- **`offline-network.spec.ts`** — 4 tests. `setOffline(true)` → reconnect,
+  slow-3G via `page.route` delay, intermittent 503s, 429 from server (no
+  infinite retry storm).
+- **`multi-tab.spec.ts`** — 3 tests. Same user two tabs, two users two
+  contexts (storage isolated), logout in tab A doesn't sync to tab B
+  (documented gap).
+- **`environment.spec.ts`** — 5 tests. JS disabled, localStorage quota
+  exhausted, hostile CSS hiding nav, clock skew ±1h / -2d.
+
+Pending tests are marked `test.fixme` and will activate once the feature or
+helper they depend on lands.
+
+## Browser & UA matrix
+
+| Project | Engine | UA / Device | Why |
+|---|---|---|---|
+| `chromium-desktop` | Chromium | Desktop Chrome | Baseline. Full suite runs here. |
+| `chromium-pixel7` | Chromium | Pixel 7 device descriptor | Chrome Android. |
+| `chromium-galaxy-s22` | Chromium | Galaxy viewport + Samsung phone UA | Chrome on Samsung hardware. |
+| `samsung-internet` | Chromium | Galaxy viewport + SamsungBrowser UA | **Tier-A Samsung Internet baseline.** |
+| `edge-android` | Chromium | Pixel viewport + EdgA UA | Edge Mobile (Blink-based). |
+| `chrome-ios` | Chromium | iPhone viewport + CriOS UA | Chrome iOS UA emulation. Real Chrome on iOS runs on WebKit, so this project only covers UA-dependent behavior. |
+| `webkit-iphone` | WebKit | iPhone 14 Pro | Real iOS Safari engine. |
+| `firefox-android` | Firefox | Pixel viewport + Firefox Android UA | Gecko engine. |
+| `firefox-desktop` | Firefox | Desktop Firefox | FF-specific quirks. |
+
+Only the `@smoke` happy-path runs across all projects (controlled by
+`grep` in `playwright.config.ts`). The full Phase 1 suite is
+`chromium-desktop`-only by default to keep CI under 15 min.
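+
+The split described above can be modelled in a few lines. This is a
+simplified sketch, not the actual `playwright.config.ts`: `ProjectSketch`
+is a hypothetical, trimmed-down stand-in for a Playwright project entry,
+only a subset of the projects is listed, and the file names and test titles
+in the usage note are made up. It assumes `grep` filters on the test title
+(including tags) and `testIgnore` filters on the file path, which is how
+Playwright's project options behave.
+
+```typescript
+// Simplified model of how per-project `grep` and `testIgnore` split the matrix.
+interface ProjectSketch {
+  name: string;
+  grep?: RegExp; // only tests whose title (incl. tags) matches will run
+  testIgnore?: RegExp; // files whose path matches are skipped
+}
+
+const SMOKE = /@smoke/;
+
+const projects: ProjectSketch[] = [
+  // Baseline project: no grep, so it runs everything except the mobile folder.
+  { name: 'chromium-desktop', testIgnore: /09-mobile/ },
+  // Matrix projects: restricted to @smoke-tagged tests.
+  { name: 'chromium-pixel7', grep: SMOKE },
+  { name: 'samsung-internet', grep: SMOKE },
+  { name: 'webkit-iphone', grep: SMOKE },
+  { name: 'firefox-android', grep: SMOKE },
+];
+
+// Decide whether a given test would be picked up by a project.
+function wouldRun(project: ProjectSketch, file: string, title: string): boolean {
+  if (project.testIgnore && project.testIgnore.test(file)) return false;
+  if (project.grep && !project.grep.test(title)) return false;
+  return true;
+}
+
+// List the names of all projects that would run a given test.
+function projectsRunning(file: string, title: string): string[] {
+  return projects.filter((p) => wouldRun(p, file, title)).map((p) => p.name);
+}
+```
+
+Under this model, `projectsRunning('specs/01-auth/join.spec.ts', 'join with PIN')`
+returns only `chromium-desktop`, while a title ending in `@smoke` is picked
+up by every listed project.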
+
+### Samsung Internet — three escalation tiers
+
+Samsung Internet ships on every Galaxy phone (~5% of mobile traffic in Germany).
+It's **Blink-based**, so Tier-A catches ~90% of regressions. Real Samsung
+divergences (Smart Switch save-data mode, dark-mode injection, custom
+autoplay, in-browser ad blocking) are only reproducible at Tier B+:
+
+- **Tier A** *(this repo, free, in CI)*: Playwright Chromium with the
+  Samsung Internet user-agent + Galaxy viewport. See the `samsung-internet`
+  project in `playwright.config.ts`.
+- **Tier B** *(free, manual, future)*: Android Studio emulator on Linux →
+  install Samsung Internet APK → enable `--remote-debugging-port=9222` →
+  `chromium.connectOverCDP('http://localhost:9222')`. Setup docs live in
+  `docs/samsung-emulator.md` (to be written).
+- **Tier C** *(paid, optional)*: BrowserStack or LambdaTest cloud devices.
+  Real Galaxy S22/S23 hardware via the vendor's Playwright integration.
+
+## Test isolation
+
+Every test runs against a **freshly truncated database**:
+
+1. `global-setup.ts` waits for `/health`, logs in admin, and disables every
+   rate-limit and quota toggle via `PATCH /admin/config`.
+2. The auto-fixture `truncate` in `fixtures/test.ts` calls
+   `POST /api/v1/admin/__truncate` before every test.
+3. The truncate endpoint is only registered when the backend is started
+   with `EVENTSNAP_TEST_MODE=1` (see `backend/src/main.rs` and
+   `backend/src/handlers/test_admin.rs`). Production builds return 404.
+
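+Single-worker by design (`workers: 1` in the config): every test truncates
+the one shared database, so parallel workers would wipe each other's data.
+Per-worker isolated DBs are left for a later phase.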
+
+## Architecture
+
+```
+e2e/
+├── docker-compose.test.yml   # Isolated test stack: db :55432, caddy :3101
+├── Caddyfile.test            # Proxies /api/* /media/* /health to backend
+├── playwright.config.ts      # UA matrix + smoke grep
+├── global-setup.ts           # admin login, rate-limit disable
+├── global-teardown.ts        # (no-op; use `npm run stack:down`)
+├── fixtures/
+│   ├── api-client.ts         # Typed wrapper over /api/v1/*
+│   ├── db.ts                 # Direct Postgres escape hatch (locked-PIN, etc.)
+│   ├── test.ts               # Central test.extend (guest, host, signIn fixtures)
+│   └── media/                # sample.jpg, sample.mp4, not-an-image.jpg
+├── helpers/
+│   ├── sse-listener.ts       # Async SSE iterator with waitForEvent()
+│   ├── storage-helpers.ts    # localStorage/sessionStorage helpers
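+│   ├── upload-client.ts      # Node-side multipart upload + JPEG/PNG/ELF magic constants
+│   ├── touch.ts              # longPress / doubleTap / swipe / inlineStyle / computedStyle
+│   └── fake-media.ts         # Camera permissions (Chromium only)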
+├── page-objects/
+│   ├── join-page.ts          # /join
+│   ├── recover-page.ts       # /recover
+│   ├── admin-login-page.ts   # /admin/login
+│   ├── feed-page.ts          # /feed + bottom nav
+│   ├── upload-sheet.ts       # UploadSheet.svelte + /upload
+│   ├── lightbox.ts           # LightboxModal.svelte
+│   ├── account-page.ts       # /account
+│   ├── host-dashboard.ts     # /host
+│   ├── admin-dashboard.ts    # /admin
+│   └── export-page.ts        # /export
+└── specs/
+    ├── __smoke/              # @smoke cross-UA matrix (1 spec)
+    ├── 01-auth/
+    ├── 02-upload/
+    ├── 03-feed/
+    ├── 04-host/
+    ├── 05-admin/
+    ├── 06-export/
+    ├── 07-adversarial/
+    ├── 08-browser-chaos/
+    └── 09-mobile/
+```
+
+## Debugging a failure
+
+- `npm run test:e2e:ui` — interactive UI with time-travel and selector probe.
+- `npm run test:e2e:headed` — watch the browser run live.
+- `npm run test:e2e:debug` — Playwright inspector with breakpoints.
+- `npm run stack:logs` — tail backend + Postgres logs during a failure.
+- `playwright-report/index.html` — opens the HTML report (auto-generated on every run).
+- Trace files (`test-results/**/trace.zip`) can be dragged and dropped into `https://trace.playwright.dev`.
+
+## Conventions
+
+- **One assertion per `expect`**. Bundling multiple expects in one statement
+  loses the line-level failure context.
+- **Wait on data, not time**. Use `expect.poll` for DB checks; never `waitForTimeout` in production specs.
+- **`@smoke` tag** on each suite's happiest path so the matrix run stays under 2 min.
+- **`test.fixme`** for features that need infrastructure not yet built (Node-side multipart upload helper, real video fixtures, etc.). Fixme tests don't fail the suite but show up in the report.
+- **Page objects own selectors**. Specs never use raw locators.
+- **German text in assertions** is fine — it's not going to change frequently. When it does, the page object is the only file to update.
+
+## Roadmap
+
+### Phase 2 — Adversarial & browser chaos ✅ landed
+
+See the **What's tested** table above and the per-file breakdown.
+Known findings surfaced (documented in tests, not silent failures):
+
+1. `/admin/login` has no rate-limit or lockout — bcrypt cost is the only defense.
+2. The app does not listen for the `localStorage` `storage` event, so a
+   logout in tab A does not immediately sign out tab B; tab B is only
+   signed out by the next 401 from any API call.
+3. SVG uploads currently pass the magic-byte check (depends on `infer`'s
+   detection coverage) — consider adding `X-Content-Type-Options: nosniff`
+   + CSP on `/media/*` if SVGs are ever expected as user content.
+
+### Phase 3 — Mobile gestures (`specs/09-mobile/`) ✅ landed
+
+Runs only on the `chromium-mobile` project (Pixel 7 device descriptor with
+`hasTouch` and `isMobile`). The `chromium-desktop` project explicitly
+ignores this folder via `testIgnore` in [playwright.config.ts](playwright.config.ts).
+
+- **`touch-targets.spec.ts`** — 4 tests. Audits ≥ 44×44 px on bottom nav,
+  FAB, join submit, admin-login submit, PIN-modal buttons. Uses
+  `expect.soft` so a single failure surfaces the actual bounding-box
+  dimensions instead of stopping the suite.
+- **`safe-area.spec.ts`** — 4 tests. Asserts `env(safe-area-inset-bottom)`
+  is present in the inline style of every bottom-anchored UI element
+  (bottom nav, UploadSheet, ContextSheet), and that the nav stays flush
+  with the viewport bottom on a no-notch emulated device.
+- **`gestures-longpress.spec.ts`** — 3 tests. A 600 ms hold on a
+  FeedListCard opens the ContextSheet; a 200 ms tap does not; the
+  click-suppression logic prevents the lightbox from also opening at
+  pointer-up. Driven via `page.mouse.down/up` because the `longpress`
+  action listens for pointer events (mouse/touch/pen unified).
+- **`gestures-doubletap.spec.ts`** — 2 tests. Double-tap on a feed card
+  image button records a like; double-tap inside the lightbox triggers
+  the heart-burst animation and records a like. Assertions read the like
+  count back via `/api/v1/feed` so they don't couple to specific badge
+  markup.
+- **`viewport-reflow.spec.ts`** — 5 tests. Portrait, landscape, narrow
+  (320×568), phablet (480×1024) — each asserts the bottom nav is
+  visible, the FAB stays roughly centered, and there's no horizontal
+  overflow on `<html>`. Plus a rotation test that confirms auth survives
+  a viewport resize.
+- **`planned-gestures.spec.ts`** — 5 **`test.fixme`** stubs documenting
+  the contracts for gestures from journey §17 that aren't shipped yet
+  (lightbox swipe L/R, swipe-down to dismiss UploadSheet,
+  pull-to-refresh, long-press on a comment). Flip `test.fixme` to `test`
+  when wiring each gesture.
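+
+The touch-target audit boils down to a size check on bounding boxes. A
+minimal sketch, assuming boxes shaped like the result of Playwright's
+`locator.boundingBox()`; the function name `undersizedTargets` and the
+element names are hypothetical and not taken from `touch-targets.spec.ts`.
+
+```typescript
+// Mirrors the shape that locator.boundingBox() resolves to (when not null).
+interface Box {
+  x: number;
+  y: number;
+  width: number;
+  height: number;
+}
+
+const MIN_TARGET_PX = 44;
+
+// Returns one readable line per undersized target, so a soft assertion can
+// report the real dimensions instead of a bare boolean.
+function undersizedTargets(
+  boxes: Record<string, Box>,
+  minPx: number = MIN_TARGET_PX,
+): string[] {
+  return Object.entries(boxes)
+    .filter(([, box]) => box.width < minPx || box.height < minPx)
+    .map(([name, box]) => `${name}: ${box.width}x${box.height} px (min ${minPx}x${minPx})`);
+}
+
+// Example input with made-up element names and sizes.
+const report = undersizedTargets({
+  'bottom-nav-feed': { x: 0, y: 800, width: 96, height: 56 },
+  'fab-upload': { x: 170, y: 780, width: 40, height: 40 },
+});
+```
+
+In a spec, each entry of `report` could feed an `expect.soft` call so that
+all undersized targets show up in a single run.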
+
+#### Driving gestures: the `helpers/touch.ts` module
+
+- `longPress(page, locator, durationMs)` — holds the pointer down for
+  the duration. Default 600 ms beats the action's 500 ms threshold.
+- `doubleTap(page, locator)` — two `mouse.down/up` pairs within the
+  `doubletap` action's 300 ms window.
+- `swipe(page, from, to, steps)` — gradual mouse-driven move (used by
+  the fixme stubs once swipe gestures land).
+- `inlineStyle(locator)` / `computedStyle(locator, prop)` — read raw
+  `style` attributes (where `env(...)` strings live) and computed
+  values.
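+
+The safe-area check has to read the raw `style` attribute because computed
+styles resolve `env(...)` to concrete lengths. A minimal sketch of the
+string check that could sit on top of `inlineStyle(locator)`; the function
+name `hasSafeAreaBottomInset` is an assumption for illustration, not part
+of `helpers/touch.ts`.
+
+```typescript
+// Returns true when a raw `style` attribute references
+// env(safe-area-inset-bottom), with or without a fallback value.
+function hasSafeAreaBottomInset(rawStyle: string | null): boolean {
+  if (!rawStyle) return false;
+  return /env\(\s*safe-area-inset-bottom\b/.test(rawStyle);
+}
+
+// Example inputs (made up for this sketch).
+const withInset = hasSafeAreaBottomInset(
+  'padding-bottom: calc(0.5rem + env(safe-area-inset-bottom))',
+);
+const withoutInset = hasSafeAreaBottomInset('padding-bottom: 8px');
+```
+
+Matching on the attribute string keeps the check independent of the emulated
+device, which is why it is described as a structural check.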
+
+### Phase 3 — Real-device compat & visual / a11y (not landed)
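+- Remaining gestures: swipe lightbox L/R, swipe-down dismiss, pull-to-refresh, long-press on a comment (stubbed as `test.fixme` in `specs/09-mobile/`). Long-press on posts, double-tap like, and the touch-target ≥ 44 px audit have already landed.
+- Safe-area inset visual diff on iPhone notch.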
+- Tier B Samsung Internet via `connectOverCDP` on Android Studio emulator.
+- Tier C BrowserStack integration (paid, optional).
+- `@axe-core/playwright` accessibility audits.
+- Visual regression with screenshot diffs.
+
+### Out of scope (handed to other tools)
+- Load testing → k6 / Vegeta.
+- API contract testing → backend `cargo test` integration tests.
+- Static asset auditing → Lighthouse CI.