---
name: EDRAM→memory resolve byte copy — status and remaining gaps
description: What now ships on TILE_FLUSH, which paths still fall through to skip/warn, and the Canary anchors for future expansion
type: project
originSessionId: c0486ac0-d44e-49fc-8a8d-6c28cb11ab9d
---

## Status (post 2026-04-22 landing)

`handle_event_initiator` at [gpu_system.rs:376+](../../../../xenia-rs/crates/xenia-gpu/src/gpu_system.rs) now **writes bytes into guest memory** on TILE_FLUSH. End-to-end flow:

1. `ResolveInfo::from_register_file` ([draw_state.rs](../../../../xenia-rs/crates/xenia-gpu/src/draw_state.rs)) decodes `RB_COPY_CONTROL / RB_COPY_DEST_*` + `RB_SURFACE_INFO / RB_COLOR_INFO_* / RB_DEPTH_INFO / RB_COLOR_CLEAR / RB_COLOR_CLEAR_LO / RB_DEPTH_CLEAR` into a full Canary-parity struct: rectangle (scissor ∩ dest_pitch, 8-pixel aligned), source base tiles, surface_pitch_tiles (via `GetSurfacePitchTiles`), MSAA, 64bpp flag, clear values, dest_base **masked to `0x1FFF_FFFF`**, Endian128, format, array flag.
2. `ShadowEdram` ([edram.rs](../../../../xenia-rs/crates/xenia-gpu/src/edram.rs)): a 10 MiB (2048 tiles × 80 × 16 samples × 4 B) CPU-side copy of EDRAM that holds per-tile bytes. Clear-resolves paint `RB_COLOR_CLEAR` into the source tiles via `fill_rect_32bpp`; the copy loop reads out via `read_sample_32bpp`.
3. `resolve::copy_to_memory` ([resolve.rs](../../../../xenia-rs/crates/xenia-gpu/src/resolve.rs)): per-pixel loop. For a `k_8_8_8_8` source and dest (the bitwise-equivalent fast path) it applies `apply_endian_128` and writes each result with `mem.write_u32` at the offset given by `tiled_2d_offset(x, y, pitch_aligned_to_32, bpp_log2=2)`. Page versions bump, so `texture_cache_host.rs` re-uploads on next bind.
4. `stats.resolves_copied_total`, `resolves_skipped_total`, and `resolve_samples_written` flow to HUD row 2.

## What's supported (expanded coverage)

- **Color sources** (any of these → any compatible color dest):
  - `k_8_8_8_8` (0), `k_8_8_8_8_GAMMA` (1) → `k_8_8_8_8` (6), `k_8_8_8_8_A` (14), `k_8_8_8_8_AS_16_16_16_16` (50).
  - `k_2_10_10_10` (2), `k_2_10_10_10_AS_10_10_10_10` (10) → `k_2_10_10_10` (7), `k_2_10_10_10_AS_16_16_16_16` (54).
  - `k_16_16_FLOAT` (6) → `k_16_16_FLOAT` (31).
  - `k_32_FLOAT` (14) → `k_32_FLOAT` (36).
  - Gated by `is_32bpp_bitwise_equivalent` ([resolve.rs](../../../../xenia-rs/crates/xenia-gpu/src/resolve.rs)), mirroring Canary `IsColorResolveFormatBitwiseEquivalent` (xenos.h:614).
- **Depth sources**: `kD24S8` (0) → `k_24_8` (22); `kD24FS8` (1) → `k_24_8_FLOAT` (23). Reads depth tiles at `RB_DEPTH_INFO.depth_base`.
- **Rectangle derivation**: vertex-fetch-constant-0 when present (6-dword vertex buffer with endian-decoded floats, Fixed16p8 rounding, 3-vertex bounding box per Canary `draw_util.cc:950-1028`). Falls back to scissor ∩ `(0, 0, dest_pitch, dest_height)` when VF0 isn't a valid resolve vertex buffer. All outputs 8-pixel-aligned via `RESOLVE_ALIGNMENT_PIXELS = 8`.
- **`CopySampleSelect` sanitation** (`xenos.h:1039-1052`): invalid selectors are remapped according to the MSAA mode and whether the source is depth. Single-sample picks (`k0/k1/k2/k3`) honored; averaging modes (`k01/k23/k0123`) pick sample 0 + log `warn` (full averaging TODO).
- Endian: `kNone`, `k8in16`, `k8in32`, `k16in32` all correct. `k8in64`/`k8in128` approximated as `k8in32` + `tracing::warn`.
- Clear-resolve + copy-resolve paths both work.
- Destination address masked to Xenon 29-bit physical space.

## What logs + skips (graceful)

All of the below increment `resolves_skipped_total` and emit a `tracing::warn` identifying the reason; boot continues:

- 64bpp source (`k_16_16_16_16`, `k_16_16_16_16_FLOAT`, `k_32_32_FLOAT`).
- 3D/stacked destination (`copy_dest_array = 1`): Canary `Tiled3D` not ported.
- Non-zero `dest_exp_bias` on linear formats.
- Non-bitwise-equivalent source/dest pair (e.g. `k_16_16` → `k_16_16`, which would need conversion tables).

## Deferred (next-session backlog, ordered by ROI)

**Small, bounded; take these first:**

1. **MSAA sample averaging** (`CopySampleSelect::k01/k23/k0123`). Today falls back to sample 0 + `warn`.
   Fix: read N samples, average by format-aware rule (unorm8 averaged as int, float averaged as float). Needs per-format decoder.
2. **64bpp source** (`k_16_16_16_16`, `k_16_16_16_16_FLOAT`, `k_32_32_FLOAT`). Skipped + logged. Needs double-tile EDRAM stride (`pitch_tiles << is_64bpp`) and two `write_u32` per pixel. Straightforward refactor of `resolve::copy_to_memory`.
3. **`RB_COLOR_CLEAR_LO` for 64bpp clear paint**. Already captured in `ResolveInfo` but `fill_rect_32bpp` only writes one lane. Companion to #2.
4. **Endian `k8in64` / `k8in128`** (properly). Approximated as `k8in32` today. Buffer pixels in pairs/quads before tile-write. Rare in practice.
5. **`copy_dest_exp_bias != 0`**. Skipped + logged. Needs float-format awareness; bake the scale factor into the sample converter.

**Large lifts; their own sessions:**

6. **wgpu render-target readback into `ShadowEdram`**. The clear-then-resolve path works, but once Sylpheed *draws* (currently `first Xenos draw: 0`), drawn pixels never reach EDRAM because the draw pipeline writes to wgpu attachments, not the shadow. Needs async `copy_texture_to_buffer` + CPU retile. Probably what unblocks frame-2 and beyond.
7. **3D / array destinations** (`copy_dest_array = 1`). Needs Canary's `Tiled3D` + `GetTiledOffset3D` ported. Rare on first-pixels path.
8. **Non-bitwise-equivalent conversion**: e.g. `k_16_16` RT (signed, range [-32, 32]) → `k_16_16` texture (unsigned). Requires Canary's conversion shader tables (`draw_util.cc:1320-1391` shader selection).

## Canary anchors (for future expansion)

- [draw_util.cc:926-1318](../../../../xenia-canary/src/xenia/gpu/draw_util.cc): full `GetResolveInfo` including vertex-fetch rect.
- [draw_util.cc:1320-1391](../../../../xenia-canary/src/xenia/gpu/draw_util.cc): shader selection (fast vs full paths).
- [render_target_cache.cc:1045](../../../../xenia-canary/src/xenia/gpu/render_target_cache.cc): `GetResolveCopyRectanglesToDump` for host-RT dump.
- [texture_address.h:190-260](../../../../xenia-canary/src/xenia/gpu/texture_address.h): `Tiled3D` (for `copy_dest_array`).
- [xenos.h:1039-1047](../../../../xenia-canary/src/xenia/gpu/xenos.h): `SanitizeCopySampleSelect` for the MSAA sample-select rules.
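
The per-dword rules listed under "What's supported" (format gating, endian swap, destination masking) can be sketched roughly as below. This is a minimal illustration, not the code in `resolve.rs`: the names `Endian`, `swap_dword`, `is_32bpp_pair_supported`, and `mask_dest_base` are hypothetical, and the numeric format IDs are copied from the supported-pairs list in these notes rather than from Canary's headers.

```rust
/// Xenon physical address space is 29 bits wide (mask value taken from the notes).
const PHYSICAL_MASK: u32 = 0x1FFF_FFFF;

/// Hypothetical stand-in for the four endian modes the notes list as correct.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Endian {
    None,
    E8in16,
    E8in32,
    E16in32,
}

/// Apply a 32-bit-or-narrower endian swap to one dword.
pub fn swap_dword(value: u32, endian: Endian) -> u32 {
    match endian {
        Endian::None => value,
        // Swap the two bytes inside each 16-bit half.
        Endian::E8in16 => ((value & 0x00FF_00FF) << 8) | ((value & 0xFF00_FF00) >> 8),
        // Reverse all four bytes.
        Endian::E8in32 => value.swap_bytes(),
        // Swap the two 16-bit halves.
        Endian::E16in32 => value.rotate_left(16),
    }
}

/// Color source (render-target format ID) -> dest (texture format ID) pairs
/// that the notes list as bitwise-equivalent at 32bpp. Anything else is skipped.
pub fn is_32bpp_pair_supported(src_rt_format: u32, dest_tex_format: u32) -> bool {
    match src_rt_format {
        // k_8_8_8_8, k_8_8_8_8_GAMMA
        0 | 1 => matches!(dest_tex_format, 6 | 14 | 50),
        // k_2_10_10_10, k_2_10_10_10_AS_10_10_10_10
        2 | 10 => matches!(dest_tex_format, 7 | 54),
        // k_16_16_FLOAT
        6 => dest_tex_format == 31,
        // k_32_FLOAT
        14 => dest_tex_format == 36,
        _ => false,
    }
}

/// Clamp a raw destination base to the 29-bit physical range.
pub fn mask_dest_base(raw: u32) -> u32 {
    raw & PHYSICAL_MASK
}
```

Under these assumptions, `swap_dword(0x1122_3344, Endian::E8in16)` gives `0x2211_4433`, and a `k_16_16` (4) source paired with a `k_16_16` (25) texture is rejected, matching the skip list above. Depth pairs and the 64-bit and 128-bit endian modes are deliberately left out of the sketch.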