EventSnap/e2e/README.md
MechaCat02 e42d8a92a1 feat(e2e): Playwright suite — 134 tests across 9 spec areas + UA matrix
Adds an end-to-end Playwright test suite under e2e/ that spins up an
isolated docker-compose stack (Postgres :55432, Caddy :3101, backend with
EVENTSNAP_TEST_MODE=1, SvelteKit adapter-node frontend) and exercises the
SvelteKit app against the real Rust backend.

Phase 1 — happy paths covering every documented USER_JOURNEYS.md flow:
  01-auth/      join, recover, admin login, leave event, PIN lockout
  02-upload/    gallery picker (API path), rate-limit + admin toggle
  03-feed/      like/comment SSE, filters, SSE reconnect on visibility
  04-host/      event lock API, ban/unban, promote
  05-admin/     config validation, foundational authz guards, stats
  06-export/    /export status + download stub
  __smoke/      cross-UA happy-path (runs on every UA project)

Phase 2 — adversarial + browser chaos:
  07-adversarial/  XSS payloads (6 × display name path), SQLi shapes,
                   length / encoding / RTL override / NUL byte;
                   file-upload boundaries (ELF body claimed as JPEG,
                   oversize vs max_image_size_mb, zero-byte, NUL
                   filename, path-traversal, SVG-with-script);
                   JWT alg:none, signature/payload tamper, expired
                   session, PIN brute-force (serial + parallel),
                   admin password brute-force; deep authz (cross-user
                   delete, banned user across like/comment/feed-read,
                   host→admin escalation); small-scale DDoS (20× /join,
                   10MB comment body, 10 concurrent SSE).
  08-browser-chaos/ localStorage / sessionStorage / cookie purge,
                    IndexedDB drop mid-session, offline → reconnect,
                    slow-3G, 503 flakes, 429 with no retry storm,
                    multi-tab same/different user, no-JS, hostile CSS,
                    clock skew ±1h / -2d, localStorage quota exhausted.

Phase 3 — mobile gestures (runs only on chromium-mobile / Pixel 7):
  09-mobile/    touch-target ≥44px audit, env(safe-area-inset-bottom)
                structural check, long-press (FeedListCard → ContextSheet,
                quick-tap negation, click-suppression), double-tap
                (feed card like + lightbox heart-burst, via synthetic
                pointer events to bypass the first-tap-fires-click trap),
                viewport reflow (portrait/landscape/narrow/phablet),
                plus fixme stubs documenting planned gestures (swipe
                lightbox L/R, swipe-down dismiss, pull-to-refresh,
                long-press-comment).

Cross-UA matrix (chromium-engine projects run @smoke only):
  chromium-pixel7, chromium-galaxy-s22, samsung-internet (Samsung UA
  emulation on Galaxy viewport), edge-android, plus webkit-iphone,
  chrome-ios, firefox-android, firefox-desktop — the latter four need
  libavif16 on the host (Playwright dep) but the configs are in place.

Infrastructure:
  - fixtures/test.ts central test.extend (api, db, adminToken, guest,
    host, signIn). Per-test DB truncate via the dev-only POST
    /admin/__truncate route, gated by EVENTSNAP_TEST_MODE=1.
  - helpers/sse-listener.ts, helpers/upload-client.ts (Node-side
    multipart for adversarial file-upload tests + JPEG/PNG/ELF magic
    constants), helpers/touch.ts (longPress / doubleTap / swipe /
    inlineStyle / computedStyle).
  - 10 page objects covering every route + UploadSheet/Lightbox.
  - global-setup waits for /health, logs in admin, disables every
    rate-limit and quota toggle.
  - .github/workflows/e2e.yml: PR check runs chromium-desktop + the
    smoke matrix in parallel, uploads playwright-report/ and traces on
    failure.

Findings the suite surfaces as live `[finding]` warnings (not silenced):
  1. /admin/login has no rate-limit or lockout (bcrypt cost only).
  2. PIN-attempt counter races under parallel /recover requests.
  3. Zero-byte uploads pass /api/v1/upload.
  4. SVG-with-script can pass the magic-byte check (consider CSP +
     X-Content-Type-Options on /media/*).

Stack-internal docs live in e2e/README.md (UA tier table, Samsung
Internet escalation tiers A/B/C, debugging tips, roadmap).

Final tally: 134 passed / 0 failed / 9 skipped (test.fixme stubs for
not-yet-shipped gestures and one UI-upload-flow investigation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 19:37:11 +02:00

# EventSnap E2E Suite
Playwright-driven end-to-end tests for the EventSnap stack. The suite spins
up an isolated docker-compose stack on ports `:3101` (Caddy → frontend +
backend) and `:55432` (Postgres), and exercises the SvelteKit frontend
against a real Rust backend with rate limits and quotas disabled.

**Phases 1, 2, and 3 (mobile gestures) have landed**:
- **Phase 1** — happy-path coverage of every documented user journey, plus a
smoke matrix across nine browser/UA profiles to catch engine-level
divergences.
- **Phase 2** — adversarial inputs (XSS, SQL-injection, JWT forgery, MIME
spoofing, oversize, brute-force) and browser chaos (storage purge,
offline/slow-3G, multi-tab, clock skew, no-JS, quota exhaustion).
- **Phase 3 (gestures only)** — touch-target audit, safe-area structural
check, long-press → ContextSheet, double-tap → like, viewport reflow,
plus `test.fixme` stubs for planned gestures (lightbox swipe, swipe-down
dismiss, pull-to-refresh).

Phase 3 real-device compat (Android emulator + Samsung Internet via
`connectOverCDP`, BrowserStack), visual regression, and a11y audits are
sketched in the **Roadmap** at the bottom.
## Quickstart
```bash
cd e2e
npm install
npm run install:browsers # one-time: ~500 MB across chromium/firefox/webkit
# 1. Boot the test stack (rebuilds backend + frontend Docker images)
npm run stack:up
# 2. Wait ~20s for migrations + warmup, then run tests
npm run test:e2e # full Phase 1 suite on chromium-desktop
npm run test:e2e:smoke # cross-UA smoke matrix (~9 projects × 1 test)
npm run test:e2e:ui # interactive Playwright UI mode
# 3. After: tear the stack down (deletes volumes)
npm run stack:down
```
The CI workflow at `.github/workflows/e2e.yml` runs both jobs on every PR.
## What's tested
Every spec covers a journey from [`docs/USER_JOURNEYS.md`](../docs/USER_JOURNEYS.md)
or a security/chaos scenario. One folder per area:
| Folder | Phase | Journeys / Topic | Tests | Notes |
|---|---|---|---|---|
| `specs/01-auth/` | 1 | §1, §2, §3, §11, §15 | 13 | Join, recover, PIN lockout, admin login, leave event. |
| `specs/02-upload/` | 1 | §5, §6, §18 | 5 | Gallery picker, multi-file, rate-limit, admin toggle. |
| `specs/03-feed/` | 1 | §7, §8, §17 | 5 | Like/comment SSE, filter chips, SSE reconnect. |
| `specs/04-host/` | 1 | §9 | 5 | Event lock, ban/unban, role change. |
| `specs/05-admin/` | 1 | §11, §16 | 11 | Config validation, foundational auth guards, stats. |
| `specs/06-export/` | 1 | §12 | 3 | Status, release, download stub. |
| `specs/__smoke/` | 1 | (matrix) | 1 × 9 UAs | `@smoke`-tagged happy-path on every UA project. |
| `specs/07-adversarial/` | **2** | Input attacks, file upload boundaries, JWT forgery, brute-force, deep authorization, small DDoS | ~40 | See breakdown below. |
| `specs/08-browser-chaos/` | **2** | Storage purge, IndexedDB, offline/slow-3G, multi-tab, no-JS, clock skew, quota | ~20 | See breakdown below. |
| `specs/09-mobile/` | **3** | Touch-target audit, safe-area, long-press, double-tap, viewport reflow, fixme stubs | 23 | Runs only on `chromium-mobile` (Pixel 7 viewport). See below. |
### Phase 2 — adversarial (`specs/07-adversarial/`)
- **`xss-injection.spec.ts`** — 13 tests. Six XSS payloads × display-name path
+ four SQLi patterns + length/encoding edge cases (NUL byte, RTL override,
caption overflow). Asserts `window.__xssFired` never gets set and no
`dialog` event fires.
- **`ui-rendering.spec.ts`** — 2 tests. Belt-and-braces: even when a
script payload sits in localStorage as the user's display name, rendering
through `/account` keeps it as text.
- **`file-upload-attacks.spec.ts`** — 9 tests. ELF body claimed as JPEG,
oversize image vs `max_image_size_mb`, zero-byte, missing file field,
path-traversal filename, NUL filename, `application/*` declared category
bypass, SVG-with-script.
- **`auth-tampering.spec.ts`** — 8 tests. `alg:none` forging admin role,
signature tamper, payload tamper with original signature, logged-out
session reuse, header without `Bearer `, missing Authorization,
PIN brute-force lockout, admin password brute-force (documented finding —
no lockout today, bcrypt cost is the only defense).
- **`authorization-deep.spec.ts`** — 6 tests. Cross-user comment delete,
banned user across like/comment/feed-read, host→admin escalation attempts.
- **`ddos.spec.ts`** — 4 small-scale abuse tests. 20 parallel /join, 10 MB
comment body, 10 concurrent SSE streams, malformed JSON.
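
To illustrate the `alg:none` case listed under `auth-tampering.spec.ts`, an unsigned token can be assembled by hand. This is a minimal sketch rather than the suite's actual code: the helper name `forgeUnsignedToken` and the claim names (`sub`, `role`, `exp`) are assumptions, because the backend's JWT claim layout is not described in this README.

```typescript
// Hypothetical helper: builds a JWT whose header declares alg "none" and
// whose signature segment is empty. A correct backend must reject it.

/** base64url-encode an ASCII JSON string (URL-safe alphabet, no padding). */
function base64UrlEncode(json: string): string {
  return btoa(json).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

/** Claim names passed in are assumptions, not the backend's documented schema. */
function forgeUnsignedToken(claims: Record<string, unknown>): string {
  const header = base64UrlEncode(JSON.stringify({ alg: "none", typ: "JWT" }));
  const payload = base64UrlEncode(JSON.stringify(claims));
  return `${header}.${payload}.`; // trailing dot: empty signature segment
}

const forged = forgeUnsignedToken({
  sub: "guest-1",
  role: "admin",
  exp: Math.floor(Date.now() / 1000) + 3600,
});
```

A spec would then send `Authorization: Bearer ${forged}` to an admin-only route and assert that the request is rejected.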
### Phase 2 — browser chaos (`specs/08-browser-chaos/`)
- **`storage-purge.spec.ts`** — 5 tests. `localStorage.clear()` mid-session,
cookies cleared (JWT in localStorage still works), sessionStorage cleared,
admin force-relogin, PIN intentionally survives clearAuth.
- **`indexeddb.spec.ts`** — 2 tests. Drop all IDB databases mid-session;
stub IDB to undefined before navigation.
- **`offline-network.spec.ts`** — 4 tests. `setOffline(true)` → reconnect,
slow-3G via `page.route` delay, intermittent 503s, 429 from server (no
infinite retry storm).
- **`multi-tab.spec.ts`** — 3 tests. Same user two tabs, two users two
contexts (storage isolated), logout in tab A doesn't sync to tab B
(documented gap).
- **`environment.spec.ts`** — 5 tests. JS disabled, localStorage quota
exhausted, hostile CSS hiding nav, clock skew ±1h / -2d.

Pending tests covering features that need a Node-side multipart upload helper
are marked `test.fixme` and will activate when that helper lands.
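
The slow-3G and retry-storm checks in `offline-network.spec.ts` are described as `page.route`-based. A minimal sketch of that idea follows, assuming a hypothetical helper named `throttleApi`. The two interfaces are local stand-ins so the snippet runs without Playwright installed; Playwright's `Page.route()` and `Route.continue()` are the real methods they mirror.

```typescript
// Local structural stand-ins for the parts of Playwright's Route and Page
// that this sketch touches.
interface RouteLike {
  continue(): Promise<void>;
}
interface RoutablePage {
  route(url: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Delays every request matching `pattern` by `delayMs` before letting it
 * through, and returns a live counter of intercepted requests.
 */
async function throttleApi(page: RoutablePage, pattern: string, delayMs: number) {
  const stats = { requests: 0 };
  await page.route(pattern, async (route) => {
    stats.requests += 1;
    await sleep(delayMs);
    await route.continue();
  });
  return stats;
}
```

In a spec this might be called as `const stats = await throttleApi(page, "**/api/v1/**", 400)`, where both the glob and the delay are illustrative values. The returned counter gives a retry-storm test something concrete to bound, for example by asserting that `stats.requests` stays below a small ceiling after a 429.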
## Browser & UA matrix
| Project | Engine | UA / Device | Why |
|---|---|---|---|
| `chromium-desktop` | Chromium | Desktop Chrome | Baseline. Full suite runs here. |
| `chromium-pixel7` | Chromium | Pixel 7 device descriptor | Chrome Android. |
| `chromium-galaxy-s22` | Chromium | Galaxy viewport + Samsung phone UA | Chrome on Samsung hardware. |
| `samsung-internet` | Chromium | Galaxy viewport + SamsungBrowser UA | **Tier-A Samsung Internet baseline.** |
| `edge-android` | Chromium | Pixel viewport + EdgA UA | Edge Mobile (Blink-based). |
| `chrome-ios` | Chromium | iPhone viewport + CriOS UA | Chrome on iOS (the real browser runs on WebKit; only the UA string differs). |
| `webkit-iphone` | WebKit | iPhone 14 Pro | Real iOS Safari engine. |
| `firefox-android` | Firefox | Pixel viewport + Firefox Android UA | Gecko engine. |
| `firefox-desktop` | Firefox | Desktop Firefox | FF-specific quirks. |

Only the `@smoke` happy path runs across all projects (controlled by
`grep` in `playwright.config.ts`). The full Phase 1 suite is
`chromium-desktop`-only by default to keep CI under 15 min.
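
One way such a per-project `grep` could be laid out is sketched below. The project names come from the table above, and `grep` and `testIgnore` are real Playwright project options. The `buildProjects` helper, the `ProjectSketch` type, and the `/09-mobile/` pattern are assumptions; the actual `playwright.config.ts` may be organised differently.

```typescript
// Subset of Playwright's per-project options used in this sketch.
// A real config would also set `use` (device descriptor, UA) per project.
interface ProjectSketch {
  name: string;
  grep?: RegExp;
  testIgnore?: RegExp;
}

const SMOKE_ONLY_PROJECTS = [
  "chromium-pixel7",
  "chromium-galaxy-s22",
  "samsung-internet",
  "edge-android",
  "chrome-ios",
  "webkit-iphone",
  "firefox-android",
  "firefox-desktop",
];

function buildProjects(): ProjectSketch[] {
  return [
    // Baseline: no grep, so every spec runs, except the mobile-only folder.
    { name: "chromium-desktop", testIgnore: /09-mobile/ },
    // Every other project is limited to tests whose title carries @smoke.
    ...SMOKE_ONLY_PROJECTS.map((name) => ({ name, grep: /@smoke/ })),
  ];
}

const projects = buildProjects();
```

In a real config the resulting array would be passed as `projects` to `defineConfig` from `@playwright/test`.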
### Samsung Internet — three escalation tiers
Samsung Internet ships on every Galaxy phone (~5% of mobile traffic in DE).
It's **Blink-based**, so Tier-A catches ~90% of regressions. Real Samsung
divergences (Smart Switch save-data mode, dark-mode injection, custom
autoplay, in-browser ad blocking) are only reproducible at Tier B+:
- **Tier A** *(this repo, free, in CI)*: Playwright Chromium with the
Samsung Internet user-agent + Galaxy viewport. See the `samsung-internet`
project in `playwright.config.ts`.
- **Tier B** *(free, manual, future)*: Android Studio emulator on Linux →
install the Samsung Internet APK → enable `--remote-debugging-port=9222` →
connect with `chromium.connectOverCDP('http://localhost:9222')`. Setup docs
will live in `docs/samsung-emulator.md` (to be written).
- **Tier C** *(paid, optional)*: BrowserStack or LambdaTest cloud devices.
Real Galaxy S22/S23 hardware via Playwright's cloud integration.
## Test isolation
Every test runs against a **freshly truncated database**:
1. `global-setup.ts` waits for `/health`, logs in admin, and disables every
rate-limit and quota toggle via `PATCH /admin/config`.
2. The auto-fixture `truncate` in `fixtures/test.ts` calls
`POST /api/v1/admin/__truncate` before every test.
3. The truncate endpoint is only registered when the backend is started
with `EVENTSNAP_TEST_MODE=1` (see `backend/src/main.rs` and
`backend/src/handlers/test_admin.rs`). Production builds return 404.

Single-worker by design (`workers: 1` in the config). Per-worker isolated
DBs are a Phase-2+ change.
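
The first step of global setup, waiting for `/health`, amounts to a bounded polling loop. The sketch below shows one possible shape for it. The function name `waitUntilHealthy`, its options, and the `localhost` host are assumptions (only the `:3101` port and the `/health` path come from this README), and the probe is injected so the loop can be exercised without the stack running.

```typescript
interface WaitOptions {
  timeoutMs: number;
  intervalMs: number;
}

/**
 * Polls `probe` until it resolves to true or the timeout elapses.
 * Errors thrown by the probe (for example a refused connection while the
 * stack is still booting) are treated as "not ready yet".
 */
async function waitUntilHealthy(
  probe: () => Promise<boolean>,
  { timeoutMs, intervalMs }: WaitOptions,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      if (await probe()) return true;
    } catch {
      // Backend is not accepting connections yet; fall through and retry.
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Example probe against the test stack's Caddy port (host is an assumption).
const healthProbe = async () => (await fetch("http://localhost:3101/health")).ok;
```

A setup script could call `waitUntilHealthy(healthProbe, { timeoutMs: 60_000, intervalMs: 500 })` and abort with a clear error when it returns `false`, before attempting the admin login.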
## Architecture
```
e2e/
├── docker-compose.test.yml    # Isolated test stack: db :55432, caddy :3101
├── Caddyfile.test             # Proxies /api/* /media/* /health to backend
├── playwright.config.ts       # UA matrix + smoke grep
├── global-setup.ts            # admin login, rate-limit disable
├── global-teardown.ts         # (no-op; use `npm run stack:down`)
├── fixtures/
│   ├── api-client.ts          # Typed wrapper over /api/v1/*
│   ├── db.ts                  # Direct Postgres escape hatch (locked-PIN, etc.)
│   ├── test.ts                # Central test.extend (guest, host, signIn fixtures)
│   └── media/                 # sample.jpg, sample.mp4, not-an-image.jpg
├── helpers/
│   ├── sse-listener.ts        # Async SSE iterator with waitForEvent()
│   ├── storage-helpers.ts     # localStorage/sessionStorage helpers
│   ├── fake-media.ts          # Camera permissions (Chromium only)
│   └── touch.ts               # longPress / doubleTap / swipe / inlineStyle / computedStyle
├── page-objects/
│   ├── join-page.ts           # /join
│   ├── recover-page.ts        # /recover
│   ├── admin-login-page.ts    # /admin/login
│   ├── feed-page.ts           # /feed + bottom nav
│   ├── upload-sheet.ts        # UploadSheet.svelte + /upload
│   ├── lightbox.ts            # LightboxModal.svelte
│   ├── account-page.ts        # /account
│   ├── host-dashboard.ts      # /host
│   ├── admin-dashboard.ts     # /admin
│   └── export-page.ts         # /export
└── specs/
    ├── __smoke/               # @smoke cross-UA matrix (1 spec)
    ├── 01-auth/
    ├── 02-upload/
    ├── 03-feed/
    ├── 04-host/
    ├── 05-admin/
    ├── 06-export/
    ├── 07-adversarial/
    ├── 08-browser-chaos/
    └── 09-mobile/             # Runs only on chromium-mobile
```
## Debugging a failure
- `npm run test:e2e:ui` — interactive UI with time-travel and selector probe.
- `npm run test:e2e:headed` — watch the browser run live.
- `npm run test:e2e:debug` — Playwright inspector with breakpoints.
- `npm run stack:logs` — tail backend + Postgres logs during a failure.
- `playwright-report/index.html` — opens the HTML report (auto-generated on every run).
- Drag and drop trace files (`test-results/**/trace.zip`) into `https://trace.playwright.dev` to inspect them.
## Conventions
- **One assertion per `expect`**. Bundling multiple expects in one statement
loses the line-level failure context.
- **Wait on data, not time**. Use `expect.poll` for DB checks; never `waitForTimeout` in production specs.
- **`@smoke` tag** on each suite's happiest path so the matrix run stays under 2 min.
- **`test.fixme`** for features that need infrastructure not yet built (Node-side multipart upload helper, real video fixtures, etc.). Fixme tests don't fail the suite but show up in the report.
- **Page objects own selectors**. Specs never use raw locators.
- **German text in assertions** is fine — it's not going to change frequently. When it does, the page object is the only file to update.
## Roadmap
### Phase 2 — Adversarial & browser chaos ✅ landed
See the **What's tested** table above and the per-file breakdown.
Known findings surfaced (documented in tests, not silent failures):
1. `/admin/login` has no rate-limit or lockout — bcrypt cost is the only defense.
2. The app does not listen for the `storage` event, so logging out in tab A
does not immediately sign out tab B; tab B is only cleared by the next 401
it receives from any API call.
3. SVG uploads currently pass the magic-byte check (depends on `infer`'s
detection coverage) — consider adding `X-Content-Type-Options: nosniff`
+ CSP on `/media/*` if SVGs are ever expected as user content.
### Phase 3 — Mobile gestures (`specs/09-mobile/`) ✅ landed
Runs only on the `chromium-mobile` project (Pixel 7 device descriptor with
`hasTouch` and `isMobile`). The `chromium-desktop` project explicitly
ignores this folder via `testIgnore` in [playwright.config.ts](playwright.config.ts).
- **`touch-targets.spec.ts`** — 4 tests. Audits ≥ 44×44 px on bottom nav,
FAB, join submit, admin-login submit, PIN-modal buttons. Uses
`expect.soft` so a single failure surfaces the actual bounding-box
dimensions instead of stopping the suite.
- **`safe-area.spec.ts`** — 4 tests. Asserts `env(safe-area-inset-bottom)`
is present in the inline style of every bottom-anchored UI element
(bottom nav, UploadSheet, ContextSheet), and that the nav stays flush
with the viewport bottom on a no-notch emulated device.
- **`gestures-longpress.spec.ts`** — 3 tests. A 600 ms hold on a
FeedListCard opens the ContextSheet; a 200 ms tap does not; the
click-suppression logic prevents the lightbox from also opening at
pointer-up. Driven via `page.mouse.down/up` because the `longpress`
action listens for pointer events (mouse/touch/pen unified).
- **`gestures-doubletap.spec.ts`** — 2 tests. Double-tap on a feed card
image button records a like; double-tap inside the lightbox triggers
the heart-burst animation and records a like. Assertions read the like
count back via `/api/v1/feed` so they don't couple to specific badge
markup.
- **`viewport-reflow.spec.ts`** — 5 tests. Portrait, landscape, narrow
(320×568), phablet (480×1024) — each asserts the bottom nav is
visible, the FAB stays roughly centered, and there's no horizontal
overflow on `<html>`. Plus a rotation test that confirms auth survives
a viewport resize.
- **`planned-gestures.spec.ts`** — 5 **`test.fixme`** stubs documenting
the contracts for gestures from journey §17 that aren't shipped yet
(lightbox swipe L/R, swipe-down to dismiss UploadSheet,
pull-to-refresh, long-press on a comment). Flip `test.fixme` to `test`
when wiring each gesture.
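
The touch-target audit described above reduces to comparing bounding boxes against a 44×44 px floor. A minimal, Playwright-free sketch of that comparison follows. The function name `undersizedTargets` and the sample element names are assumptions; the `Box` shape mirrors what Playwright's `locator.boundingBox()` returns.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface NamedTarget {
  name: string;
  box: Box | null; // null when the element is not visible
}

const MIN_TOUCH_PX = 44;

/** Returns one readable line per failing target; empty when all pass. */
function undersizedTargets(targets: NamedTarget[]): string[] {
  const problems: string[] = [];
  for (const { name, box } of targets) {
    if (box === null) {
      problems.push(`${name}: not visible (no bounding box)`);
    } else if (box.width < MIN_TOUCH_PX || box.height < MIN_TOUCH_PX) {
      problems.push(`${name}: ${box.width}x${box.height} px`);
    }
  }
  return problems;
}

const report = undersizedTargets([
  { name: "fab", box: { x: 170, y: 700, width: 56, height: 56 } },
  { name: "nav-feed", box: { x: 0, y: 760, width: 90, height: 40 } },
]);
// report → ["nav-feed: 90x40 px"]
```

Feeding the result into a soft assertion such as `expect.soft(report).toEqual([])` would surface the measured dimensions of every failing element in a single run.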
#### Driving gestures: the `helpers/touch.ts` module
- `longPress(page, locator, durationMs)` — holds the pointer down for
the duration. Default 600 ms beats the action's 500 ms threshold.
- `doubleTap(page, locator)` — two `mouse.down/up` pairs within the
`doubletap` action's 300 ms window.
- `swipe(page, from, to, steps)` — gradual mouse-driven move (used by
the fixme stubs once swipe gestures land).
- `inlineStyle(locator)` / `computedStyle(locator, prop)` — read raw
`style` attributes (where `env(...)` strings live) and computed
values.
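
As a rough picture of how `longPress` could be built on `page.mouse`, consider the sketch below. It is not the contents of `helpers/touch.ts`, which are not shown here. The interfaces are local stand-ins for the Playwright methods they mirror (`page.mouse.move/down/up` and `locator.boundingBox()`), and pressing at the element's center is an assumption.

```typescript
// Local structural stand-ins so the sketch runs without Playwright.
interface TargetBox {
  x: number;
  y: number;
  width: number;
  height: number;
}
interface MouseLike {
  move(x: number, y: number): Promise<void>;
  down(): Promise<void>;
  up(): Promise<void>;
}
interface PointerPage {
  mouse: MouseLike;
}
interface LocatorLike {
  boundingBox(): Promise<TargetBox | null>;
}

/** Holds the pointer down on the locator's center for `durationMs`. */
async function longPress(
  page: PointerPage,
  locator: LocatorLike,
  durationMs = 600, // default from this README; above the 500 ms threshold
): Promise<void> {
  const box = await locator.boundingBox();
  if (box === null) throw new Error("longPress: target has no bounding box");
  await page.mouse.move(box.x + box.width / 2, box.y + box.height / 2);
  await page.mouse.down();
  // The hold itself is the behaviour under test, so a timed wait is intended.
  await new Promise<void>((resolve) => setTimeout(resolve, durationMs));
  await page.mouse.up();
}
```

A quick tap for the negative case can reuse the same function with a shorter hold, for example `longPress(page, card, 200)`.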
### Phase 3 — Real-device compat & visual / a11y (not landed)
- Long-press own/other post, swipe lightbox L/R, swipe-down dismiss, pull-to-refresh, double-tap like.
- Safe-area inset visual diff on iPhone notch.
- Touch-target ≥ 44 px audit.
- Tier B Samsung Internet via `connectOverCDP` on Android Studio emulator.
- Tier C BrowserStack integration (paid, optional).
- `@axe-core/playwright` accessibility audits.
- Visual regression with screenshot diffs.
### Out of scope (handed to other tools)
- Load testing → k6 / Vegeta.
- API contract testing → backend `cargo test` integration tests.
- Static asset auditing → Lighthouse CI.