xenia-rs/audit-runs/audit-059-handle-disambiguation/round-C1-setter-validation/FINDINGS.md
MechaCat02 481591fdb2 [AUDIT-059 R-C1] Phase C: bit-28 setter hypothesis REFUTED via dump-addr
Phase A's diagnosis (bit 28 of [0x40d09a40] gets set to exit
sub_822F1AA8's loop) is falsified by direct probe + --dump-addr in 4
sub-rounds.

Key evidence:
- sub_821B55D8 candidate fn fires 0× in ours; sub_824AA858
  (XamInputSetState wrapper) fires 0× in canary too — chain is dead code
  in both engines.
- end-of-run dump shows [0x40d09a40+0] = 0x00000021, same as at entry —
  bit 28 is NEVER set.
- bcctrl at PC 0x822F1B4C (sub_822F1AA8+0xA4) fires (LR=0x822F1B50) but
  the post-bcctrl BB head 0x822F1B50 fires 0× — bcctrl never returns.
- sub_82173990 (vtable[0] of singleton at [0x828E1F08]) is the call
  target; tid=1 wedges inside this 768-byte function on a thread-join
  to handle 0x1070 (= tid=13's thread handle).
- tid=13 (entry=sub_821748F0, ctx=0x4024a840, handle=0x1070) reaches
  sub_821C4EB0 (silph::UImpl@GamePart_Title) at cycle 1882 → audit-049
  cluster IS reached, wedges on handle 0x1078 there.

C.2 force-clear POC NOT EXECUTED — would be no-op since bit 28 is never
set. Per plan stopping criterion, hand back instead of proceeding blind.

Adds reading-error class #19: disasm-pattern-match without runtime
verification (Phase A scanned 49 oris-0x1000 sites and declared one the
setter without ever observing the bit get set).

No xenia-rs source changes. Canary repo also unchanged (config edit
reverted clean).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-11 17:57:27 +02:00

# Phase C.1 — Validation refutes Phase A's bit-28 setter hypothesis
## TL;DR
Phase A claimed: "bit 28 of `[0x40d09a40]` (controller word) gets set in ours, causing sub_822F1AA8's dispatcher loop to exit early; candidate setter is `sub_821B55D8` at PC `0x821B5DA4`."
**Phase C.1 falsifies this in 4 sub-rounds:**
1. **`sub_821B55D8` is dead code** in both engines — its `XamInputSetState` wrapper `sub_824AA858` fires 0× in both.
2. **`[0x40d09a40]` never holds a value with bit 28 set**: `--dump-addr` at end of run shows `+0x00 = 0x00000021`, unchanged from the entry value.
3. **The actual wedge is at the `bcctrl` at PC `0x822F1B4C`** (inside sub_822F1AA8 setup, BEFORE the dispatcher loop). tid=1 never reaches the loop top-check.
4. **The bcctrl calls `sub_82173990`** (vtable[0] of the dispatcher singleton at `[0x828E1F08]`), which eventually waits for tid=13 to terminate. tid=13 wedges in the audit-049 silph::UImpl@GamePart_Title chain on handle `0x1078`.
The C.2 force-clear POC (the planned next step) would have **zero effect** because bit 28 is never set. Skipped per plan stopping criterion.
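
Why a probe at `0x822F1B50` tests whether the `bcctrl` returns: PowerPC instructions are 4 bytes wide and `bcctrl` stores the address of the next instruction in LR. A minimal sketch of that address arithmetic (the constant names are illustrative, not taken from the probe tooling):

```python
FUNC_ENTRY = 0x822F1AA8   # sub_822F1AA8 entry
BCCTRL_PC = 0x822F1B4C    # the indirect call that never returns in ours
INSN_SIZE = 4             # PowerPC instructions are fixed-width 32-bit

# Offset of the call inside the function, and the return address the
# callee receives in LR (the post-bcctrl basic-block head).
offset = BCCTRL_PC - FUNC_ENTRY
return_addr = BCCTRL_PC + INSN_SIZE

print(f"sub_822F1AA8+{offset:#x}, LR after call = {return_addr:#010x}")
```

A probe that fires at the call target with this LR, combined with zero fires at the return address itself, is what separates "call never made" from "call never returned".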
## Probe-fire counts (ours, 50M-instr parallel)
| PC | sub-round | fires | meaning |
|---|---|---|---|
| `0x821B55D8` (Phase A candidate fn entry) | 1 | **0** | function never reached → β/γ |
| `0x821B5D98,DA0,DAC,D48` (loop BB heads) | 1 | **0** | function never reached |
| `0x822F1AA8` (sub_822F1AA8 entry) | 2,3,4 | 2-3 | reached |
| `0x822F1B38` (post-`bl 0x824AA8B0`) | 4 | 2 | reached |
| `0x822F1B50` (post-`bcctrl`) | 4 | **0** | **bcctrl never returns** |
| `0x822F1B60,B78,B80,BBC` (loop setup/top) | 3 | 0 | unreachable past bcctrl |
| `0x822F1E10` (loop exit cleanup) | 2 | 0 | loop never entered, never exited |
| `0x822F1E34` (post-thread-join) | 2 | 0 | never reached |
| `0x82173990` (vtable[0] target) | 4 | 2 | called via bcctrl, r3=singleton (LR=0x822F1B50) |
| `0x821748F0` (tid=13 entry) | 4 | 2 | tid=13 runs |
| `0x821C4EB0` (silph::UImpl@GamePart_Title) | 4 | 2 | audit-009/049 reached on tid=13 |
| `0x82457388,0x824574C0,0x82457408,0x82457490` (other oris candidates) | 2 | 0 | unreachable |
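
The table can be read mechanically: walking the probed PCs in the order tid=1 is expected to execute them, the first PC with zero fires marks where execution stops. A minimal sketch using the fire counts above; the list ordering and the helper name are assumptions made for illustration, not part of the probe tooling:

```python
# Fire counts copied from the table above, listed in tid=1's expected
# execution order through sub_822F1AA8 (ordering is an assumption based
# on the PC annotations; "2-3" is recorded as its lower bound, 2).
PATH = [
    (0x822F1AA8, "sub_822F1AA8 entry", 2),
    (0x822F1B38, "post-bl 0x824AA8B0", 2),
    (0x822F1B50, "post-bcctrl", 0),
    (0x822F1B60, "loop setup", 0),
    (0x822F1E10, "loop exit cleanup", 0),
    (0x822F1E34, "post-thread-join", 0),
]

def first_unreached(path):
    """Return the first (pc, label) whose probe never fired, or None."""
    for pc, label, fires in path:
        if fires == 0:
            return pc, label
    return None

pc, label = first_unreached(PATH)
print(f"execution stops before {pc:#010x} ({label})")
```

Every later probe on the path also reads zero, which is consistent with a single stall point and not with several independent misses.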
## Canary probe results
| PC | fires | meaning |
|---|---|---|
| `0x824AA858` (XamInputSetState wrapper) | **0** | sub_821B55D8 chain is dead code in CANARY too |
| `0x822F1B50` (post-bcctrl, attempted) | **0** (not testable) | canary's JitProlog probes fire only at function entries, so this BB-internal PC cannot be probed directly. Indirect evidence: per audit round-33, sub_821741C8 fires 471× in canary → bcctrl DOES return in canary |
## Critical evidence: `--dump-addr=0x40d09a40` at end of run
```
addr=0x40d09a40
+0x00: 00 00 00 21 00 00 00 01 42 44 df 00 40 54 1a 40
^^^^^^^^^^^ ^^^^^^^^^^^
+0x10: 40 54 1b 40 40 54 1b 80 40 54 1b c0 00 00 10 54
+0x20: 00 00 00 00 40 24 a8 20 00 00 00 08 00 00 00 00
```
- `[+0x00] = 0x00000021` ← bit 28 (mask 0x10000000) is NOT SET. Same value as at sub_822F1AA8 entry.
- `[+0x1c] = 0x00001054` ← spawned init thread handle (= tid=8's thread handle, NOT 0x1070)
- Thread state: tid=1 waits on handle `0x1070`, tid=13 waits on handle `0x1078`.
Handle `0x1070` is **tid=13's thread handle** (per stderr: `ExCreateThread: tid=13 handle=0x1070 entry=0x821748f0 ctx=0x4024a840 suspended=true`). So tid=1's wait at the wedge point is a **thread-join on tid=13**, NOT a thread-join on the dispatcher init thread (tid=8, handle 0x1054).
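
The two highlighted words can be re-derived from the raw dump bytes. The sketch below is a hedged illustration rather than the tool's own decoder: it reads the first two dump rows as big-endian 32-bit words (the guest is big-endian PowerPC) and tests the bit-28 mask quoted above. The helper name `be_u32` is hypothetical.

```python
import struct

# First two rows of the --dump-addr output above, as raw hex.
DUMP = bytes.fromhex(
    "00000021" "00000001" "4244df00" "40541a40"
    "40541b40" "40541b80" "40541bc0" "00001054"
)

BIT28_MASK = 0x10000000  # mask quoted in the text

def be_u32(buf, offset):
    """Read a big-endian 32-bit word at the given byte offset."""
    return struct.unpack_from(">I", buf, offset)[0]

word0 = be_u32(DUMP, 0x00)   # controller word
handle = be_u32(DUMP, 0x1C)  # spawned init thread handle

print(f"[+0x00] = {word0:#010x}, bit 28 set: {bool(word0 & BIT28_MASK)}")
print(f"[+0x1c] = {handle:#010x}")
```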
## Wedge path (corrected)
```
entry_point (sub_824AB748)  [tid=1 main]
└─ sub_8216EA68
   └─ sub_822F1AA8(controller=0x40d09a40)  [LR=0x8216EE14]
      ├─ ExCreateThread(entry=sub_822F1EE0, ctx=controller)  [PC 0x822F1B08]
      │    ⇒ tid=8 spawn, handle=0x1054 (suspended)
      ├─ bl 0x824AA8B0 (no-op probe)  [PC 0x822F1B34]
      └─ bcctrl on vtable[+0] of [0x828E1F08] singleton  [PC 0x822F1B4C]
         └─ sub_82173990(r3=singleton)  [r3=0x40ba9a80, vtable=0x40111910]
            └─ ... (768-byte function with ≥18 calls; calls sub_82448AA0, sub_824AA7A0,
                    sub_82448BC8, sub_82448C50, sub_8216F218, sub_8217C850, sub_82178E50,
                    sub_821835E0, ...)
               └─ ... → KeWaitForSingleObject INFINITE on handle 0x1070
                        (= tid=13's thread handle, thread-join)
                        ⇒ WEDGE: tid=13 never exits

(Concurrently, spawned somewhere else, not from sub_822F1AA8:)
[tid=13, spawn-handle=0x1070, ctx=0x4024a840]
└─ sub_821748F0 (worker boilerplate, entry from ExCreateThread)
   ├─ sub_82172798, sub_82172818
   └─ sub_821749C0
      └─ sub_821CF3F0
         └─ ... → sub_821C4EB0 (UImpl@GamePart_Title@silph)  [audit-009/049!]
            └─ ... → sub_821CB030 (creates KEVENT at +0x128)
               ⇒ KeWaitForSingleObject INFINITE on handle 0x1078
               ⇒ WEDGE: handle 0x1078 is never signaled in ours
```
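
The dependency between the two wedges can be expressed as a small wait-chain walk: follow each blocked thread to the owner of the handle it waits on until reaching a handle that is not a known thread handle. This is an illustrative sketch built only from the thread-state facts above; the table names and `root_blocker` are hypothetical and do not exist in xenia-rs.

```python
# Facts from the thread-state notes: which handle each thread waits on,
# and which thread a handle belongs to (when it is a thread handle).
WAITS_ON = {1: 0x1070, 13: 0x1078}              # tid -> handle it blocks on
THREAD_HANDLE_OWNER = {0x1070: 13, 0x1054: 8}   # thread handle -> tid

def root_blocker(tid):
    """Follow thread-join waits from `tid`.

    Returns (tid, handle) for the thread at the end of the chain and the
    non-thread handle it is blocked on, or (tid, None) if the chain ends
    at a thread with no recorded wait (or loops back on itself).
    """
    seen = set()
    while tid in WAITS_ON and tid not in seen:
        seen.add(tid)
        handle = WAITS_ON[tid]
        owner = THREAD_HANDLE_OWNER.get(handle)
        if owner is None:
            return tid, handle   # blocked on something that is not a thread
        tid = owner
    return tid, None

tid, handle = root_blocker(1)
print(f"tid=1 is ultimately blocked by tid={tid} waiting on handle {handle:#x}")
```

Starting from tid=1, the walk ends at tid=13 and handle `0x1078`, which is why the fix has to target that wait and not the controller word.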
## Why Phase A's hypothesis is wrong
Phase A:
1. Disassembled sub_822F1AA8's body, observed the bit-28 loop-exit check at `0x822F1BB8` and end-of-iter check at `0x822F1E0C`.
2. Mem-watch on `0x40d09a40` showed zero stores → inferred "the setter writes via some path mem-watch doesn't capture."
3. DB-scanned `oris ?, ?, 0x1000` (49 sites) and found a match in `sub_821B55D8` at PC `0x821B5DA4` with the pattern `bl sub_824AA858 ; if r3 == 0xAA: oris r11, 0x1000 ; stw`.
4. Concluded `sub_821B55D8` was the setter.
What Phase A missed:
- Mem-watch's 0-stores result was correct: **NO setter exists**. Bit 28 is never set in either engine. The mem-watch null-result was a hint that the bit-28 hypothesis itself was wrong, but Phase A interpreted it as "mem-watch misses something."
- The disasm-based hypothesis was visually compelling (a loop iterating arrays and setting bit 28 when a kernel call returns 0xAA) but was never verified at runtime.
- `sub_821B55D8` is itself dead code in both engines.
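
For context on why `oris ?, ?, 0x1000` was a plausible scan pattern at all: PowerPC `oris` ORs its 16-bit immediate, shifted left by 16, into the source register, so an immediate of `0x1000` corresponds exactly to the bit-28 mask `0x10000000`. A minimal model of that instruction, truncated to 32 bits for illustration:

```python
def oris(rs, uimm):
    """Model of PowerPC oris: rA = rS | (UIMM << 16), kept to 32 bits here."""
    return (rs | ((uimm & 0xFFFF) << 16)) & 0xFFFFFFFF

BIT28_MASK = 0x10000000

# The scanned immediate produces exactly the bit-28 mask.
assert oris(0, 0x1000) == BIT28_MASK

# What the controller word would look like if such a setter had ever run
# on the observed value; the end-of-run dump still shows 0x00000021.
observed = 0x00000021
would_be = oris(observed, 0x1000)
print(f"{observed:#010x} -> {would_be:#010x}")
```

The pattern therefore identifies instructions that *could* set the bit. It says nothing about whether any of them executes against this address.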
## Reading-error class #19: disasm-pattern-match without runtime verification
When scanning for a hypothesized signal source via DB pattern-match (`oris ?, ?, 0x1000`), the analyst must run a probe to verify that the suspected site is *both reached* and *takes the suspected path* before declaring it the cause. Phase A skipped both checks. Adding a single flag, `--dump-addr=0x40d09a40`, to the existing probe command in sub-round 2 revealed that the central assumption was wrong.
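
The rule can be written down as an explicit gate that a candidate must pass before it is named as the cause. This is only a sketch of the checklist, with hypothetical field and function names; the third check (the effect itself is observed) corresponds to the memory dump used in sub-round 2.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    entry_fires: int        # probe hits at the function entry
    path_fires: int         # probe hits at the suspected instruction or BB
    effect_observed: bool   # e.g. the bit is actually set in a memory dump

def verdict(c):
    """Classify a candidate using runtime evidence only."""
    if c.entry_fires == 0:
        return "refuted: never reached"
    if c.path_fires == 0:
        return "refuted: reached, suspected path not taken"
    if not c.effect_observed:
        return "refuted: path taken, effect not observed"
    return "supported by runtime evidence"

# Phase A's candidate, filled in with the sub-round 1 and 2 results.
phase_a = Candidate("sub_821B55D8", entry_fires=0, path_fires=0,
                    effect_observed=False)
print(phase_a.name, "->", verdict(phase_a))
```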
## Real divergence (handed to next session)
This is the **same wedge as audit-049/058/059**: tid=13 wedges in the silph::UImpl@GamePart_Title cluster on handle `0x1078`. tid=1 wedges on tid=13's thread-handle (`0x1070`) inside `sub_82173990`'s call chain.
`sub_82173990` is vtable[0] of the dispatcher singleton at `[0x828E1F08]`. It's a 768-byte function with ≥18 calls; the actual wait site is somewhere down its tree. To localize where in `sub_82173990` the wait happens, probe its BB heads + the `KeWaitForSingleObject` thunks (`sub_824AA330`, `sub_824AA708`).
The fix-shape is **NOT** "force-clear bit 28." The fix-shape is **"signal handle 0x1078 in the audit-049 cluster, or short-circuit tid=13's wait."** Round 22 (silph_synth.rs) attempted the cluster-A version of this. Cluster B (silph::UImpl) needs its own synthesis or a kernel-side signal of handle 0x1078.
## Phase C verdict
- C.1: 4 sub-rounds executed (within budget).
- C.2: **NOT EXECUTED** — POC would be no-op since bit 28 is never set. Per plan stopping criterion, do not proceed to C.2 blind when C.1 refutes the diagnosis.
- C.3: not applicable.
- Branch state: no source changes. Audit artifacts only.
## Files in this directory
- `ours-c1-probe.log/stderr` — sub-round 1, probe at sub_821B55D8 BB heads (0 fires)
- `ours-sr2-confirm-bit28.log/stderr` — sub-round 2, probe loop top/exit + dump-addr (bit 28 NEVER SET)
- `ours-sr3-wait-trace.log/stderr` — sub-round 3, probe wait site + handle 0x1070 trace
- `ours-sr4-bcctrl-trace.log/stderr` — sub-round 4, probe pre/post bcctrl + sub_82173990 entry + tid=13 entry (decisive)
- canary side in `../round-C1-setter-validation-canary/`:
  - `canary-824AA858.log`: XamInputSetState wrapper fires 0× in canary too
  - `canary-822F1B50.log`: JitProlog can't probe at BB-internal PCs (function-entry-only)