[AUDIT-059 R-A] Phase A backward-trace: divergence is sub_822F1AA8 loop exit, not factory/registry
Round-37 anchor reframe: both engines install the SAME static .rdata vtable
0x820A183C at [0x828E1F08]. Instance VAs differ only because of ε-class
allocator divergence (audit-043). vtable bytes byte-identical; the user
prompt's "factory/registry" framing was falsified.
Phase A walkthrough (rounds A1..A8):
- A.1 canary --audit_jit_prolog_pc=0x821741C8: tid=6, r3=0xBCCC4A80 (= inner
sub-object of [0x828E1F08]'s singleton), LR=0x822F1D5C (return-from-bctrl
inside sub_822F1AA8)
- A.2 found tid=6 spawn site sub_821746B0 at PC 0x82174824 spawning
entry=sub_821748F0 ctx=BC365700/BC366DA0. sub_822F1AA8 ALSO spawns a
second thread (entry=sub_822F1EE0 ctx=BCE24A40) at PC 0x822F1B08
- A.3 sub_822F1AA8 has 2 call sites, both in sub_8216EA68 (whose sole caller
is sub_824AB748 = entry_point)
- A.4 ours mirror probe: sub_821746B0 enters, [0x828E2B14] gate passes,
ExCreateThread fires returning handle 0x1070 (= tid=13). Ours' tid=13
IS the same logical thread as canary's spawned silph initializer
- A.5 canary --audit_jit_prolog_pc=0x821749C0: fires only 2× on short-lived
tid=17, tid=26 (the spawned initializers — NOT tid=6)
- A.6 canary --audit_jit_prolog_pc=0x822F1AA8: fires 1× on tid=6 with
r3=0xBCE24A40 LR=0x8216EE14 (the second sub_822F1AA8 call site)
- A.7 canary --audit_jit_prolog_pc=0x824AB748 (entry_point): fires on
tid=00000006. CONFIRMS canary's tid=6 = canary's main thread.
Verdict: identical call chain entry_point → sub_8216EA68 → sub_822F1AA8 in
both engines; same controller (ε-divergent VA, byte-identical fields).
Canary's main thread stays in sub_822F1AA8's dispatcher loop firing
sub_821741C8 ~1678×/30s. Ours' main thread exits the loop and thread-joins
on the spawned initializer (tid=13), which is itself wedged on handle 0x1078
forever.
Loop exit is gated by bit 28 of [r30+0] (the controller's flag word). Same
value 0x21 at function entry in both engines. Some code between entry and
loop check sets bit 28 in ours but not in canary. Mem-watch on 0x40d09a40
shows zero guest stores in ours' 50M-instruction parallel run, so the setter
is either a kernel-side store, a computed alias, or a probe-quantum-elided
JIT store.
Phase B classification: Class 3a (state-divergence on controller object).
The vtable is the same; the controller's bit 28 evolves differently during
sub_822F1AA8 setup. Class 4 (synthesis) is now less attractive since we
correctly reach the dispatcher with the right inputs — we just exit too
soon.
Phase C will need either JIT instrumentation to identify the bit-28 setter,
or a kernel-side hook to clear bit 28 on entry to the loop check site.
Findings notes:
- round-A4b-ours-spawn-gate/FINDINGS.md (spawn topology + tid mapping)
- round-A8-ours-822F1AA8-trace/FINDINGS.md (full loop structure + bit-28 gate)
New reading-error class #18: probe-output anchor misframing (singleton[VA]=X
vtable=Y was misread as "Y is canary-only vtable" when Y is the same
.rdata vtable in both engines).
Branch: iterate-2C/silph-ui-spawn-trace off master @ 229b46c.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,136 @@
# Phase A synthesis: canary tid=6 IS the main thread; the wedge is sub_822F1AA8's loop exit

## Top-line finding

**Canary's `tid=6` is canary's main thread.** Confirmed by probing `entry_point`
(`sub_824AB748`) with `--audit_jit_prolog_pc=0x824AB748`: fires 1× on
`tid=00000006` with `lr=BCBCBCBC` (= OS-initial / no caller). Ours numbers
its main thread `tid=1`. Same logical thread; different label.

Therefore "tid=6 fires sub_821741C8 471×" (round 33) means **the main thread**
loops inside `sub_822F1AA8` firing `sub_821741C8` ~1678×/30s in canary. In
ours, the main thread (tid=1) runs `sub_822F1AA8` ONCE, exits the loop, and
proceeds to thread-join on the spawned init thread (handle 0x1070 = tid=13),
which is itself blocked forever on handle 0x1078.
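
The thread dump committed with these notes prints waited handles in decimal (`handles: [4208]`), while the waiter list and the prose use hex (`0x1070`). A minimal sketch for cross-checking the two follows; the regex and the `waited_handles` helper are illustrative assumptions written against the dump format shown in this commit, not part of either engine.

```python
import re

# Matches "Thread diagnostics" lines of the dump, e.g.
# "hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4208], deadline: None }) ..."
WAIT_RE = re.compile(r"tid=(\d+) state=Blocked\(WaitAny \{ handles: \[([^\]]*)\]")


def waited_handles(lines):
    """Map tid -> list of waited handles, rendered as hex strings."""
    waits = {}
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            tid = int(match.group(1))
            raw = match.group(2).split(",")
            waits[tid] = [hex(int(h)) for h in raw if h.strip()]
    return waits


# Two lines copied (truncated) from the dump in this commit.
dump = [
    "hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4208], deadline: None }) pc=0x824ac578",
    "hw=1 idx=1 tid=13 state=Blocked(WaitAny { handles: [4216], deadline: None }) pc=0x824ac578",
]

for tid, handles in waited_handles(dump).items():
    print(f"tid={tid} waits on {handles}")
```

Run on those two lines, this reports tid=1 waiting on `0x1070` and tid=13 waiting on `0x1078`, matching the wait chain described above.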

## Call chain (identical in both engines, different runtime behavior)

```
entry_point (sub_824AB748)
  │
  ├─ sub_824ACB38        CRT-driven fnptr-array iterator (audit-050 region)
  ├─ ...
  └─ sub_8216EA68        Many local calls including:
       ├─ ExCreateThread(entry=sub_8217F0F8 ...)       ; sibling thread
       ├─ sub_822F1AA8(controller=...)                 ; FIRST call (PC 0x8216ECCC)
       └─ sub_822F1AA8(controller=0xBCE24A40 canary /  ; SECOND call (PC 0x8216EE10)
                                  0x40d09a40 ours)       ↑ this is the loop
```

The SECOND call is what runs the dispatcher loop. Its LR = 0x8216EE14.
Confirmed in both engines.

## sub_822F1AA8 loop structure

```
0x822F1AA8:             entry, r30 = r3 (controller)
0x822F1AEC-0x822F1B08:  ExCreateThread(entry=sub_822F1EE0, ctx=r30) → r29 = handle
0x822F1B30-0x822F1B34:  bl 0x824AA8B0(r3=r29)  ; ?
0x822F1B38-0x822F1B4C:  first bctrl → vtable[+0] of [0x828E1F08]
0x822F1B50-0x822F1B74:  setup, bl 0x824AA330 INFINITE wait on [r22+32]
0x822F1B80-0x822F1BA8:  post-wait setup; [r30+0] |= 0x2
0x822F1BB0-0x822F1BBC:  TOP-OF-LOOP CHECK: if [r30+0] & 0x10000000 → goto 0x822F1E10 (exit)
0x822F1BCC..0x822F1DEC: loop body (includes the vtable[+8] bctrl → sub_821741C8 at PC 0x822F1D58)
0x822F1DEC-0x822F1DFC:  bl 0x824AA330 INFINITE wait on [r23+0]
0x822F1E00-0x822F1E0C:  END-OF-ITERATION CHECK: if [r30+0] & 0x10000000 == 0 → goto 0x822F1BCC (re-loop)
0x822F1E10-0x822F1E18:  EXIT: [r30+0] |= 0x02000000 (set MSB-6 = LSB-25)
0x822F1E1C-0x822F1E24:  release something via bl 0x824AA2F0
0x822F1E28-0x822F1E30:  bl 0x824AA330 INFINITE on [r30+28] = SPAWNED THREAD HANDLE (thread join!)
0x822F1E40:             bl 0x824AA3E0
0x822F1E44-0x822F1E5C:  final cleanup: vtable[+24] bctrl on [0x828E1F08]
0x822F1E60-0x822F1E78:  [r30+0] = 0, then [r30+0] |= 1; bl 0x824567E0
0x822F1E7C-0x822F1E88:  epilogue
```

**Loop exit gate**: `[r30+0] & 0x10000000` (bit 28 LSB / bit 3 MSB). Set →
exit. Both top-of-loop check (0x822F1BBC) and end-of-iteration check
(0x822F1E0C) gate on the same bit.
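
The two bit-numbering conventions and the gate test can be checked with a short sketch. The helper names are illustrative assumptions; the constants (`0x10000000`, `0x02000000`, the entry value `0x21` reported by the probes, and the `|= 0x2` setup step) come from the listing and probe output in this commit.

```python
EXIT_MASK = 0x10000000  # gate tested at 0x822F1BBC and 0x822F1E0C


def lsb_bit_index(mask: int) -> int:
    """Bit index counted from the least significant bit (bit 0 = 0x1)."""
    assert mask and mask & (mask - 1) == 0, "expects a single-bit mask"
    return mask.bit_length() - 1


def msb_bit_index(mask: int, width: int = 32) -> int:
    """Bit index in PowerPC-style MSB-0 numbering for a `width`-bit word."""
    return width - 1 - lsb_bit_index(mask)


def loop_exits(flags: int) -> bool:
    """True when the gate bit is set, i.e. the dispatcher loop would exit."""
    return bool(flags & EXIT_MASK)


entry_flags = 0x21               # [r30+0] at function entry in both engines
after_setup = entry_flags | 0x2  # effect of the post-wait `|= 0x2` step

print(lsb_bit_index(EXIT_MASK), msb_bit_index(EXIT_MASK))  # 28 3
print(hex(after_setup), loop_exits(after_setup))           # 0x23 False
```

With only the documented setup applied, the flag word is `0x23` and the gate stays open, so an exit requires some additional write that sets bit 28.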

## What's different between engines

| Engine | [r30+0] at entry | Loop iterations | Exits sub_822F1AA8? |
|--------|------------------|-----------------|---------------------|
| canary | 0x21 (per probe) | ~1678+ in 30s | NO (stays in loop) |
| ours | 0x21 (per probe) | 0 (probes show none of the loop-body PCs fire after entry) | YES (exits quickly) |

Both engines have `[r30+0]=0x21` at entry, so bit 28 is NOT set. After the `ori
r11, r11, 0x2` at 0x822F1B90, both should have `[r30+0]=0x23`. Bit 28 still
not set.

So **some code sets bit 28 on [r30+0] between sub_822F1AA8 entry and the
loop check** in ours but not in canary.

Mem-watch on 0x40d09a40 (ours' controller VA) shows **zero guest writes** in
my 50M-instruction parallel run. Possible reasons:
- The setter writes from kernel/runtime code that mem-watch doesn't capture
  (kernel-host store, not guest JIT store)
- The setter writes via a computed alias (different VA but same backing)
- The bit IS set via a probe-quantum-elided JIT store

## Phase B classification

**Class 3a: state-divergence on the controller object**. The vtable
identity is the same (round-37 confirmed `0x820A183C` in both). The
controller object's bit 28 of `[+0]` evolves differently during the setup
between sub_822F1AA8 entry and the loop check.

Class 4 (synthesis) is now LESS attractive: ours' main thread DOES reach
sub_822F1AA8 with the right controller. We don't need to spawn the
dispatcher; we need to PREVENT the main thread from exiting the loop.

## Pragmatic next step: JIT instrumentation to find the bit-28 setter

Most direct diagnostic: add a JIT hook in xenia-cpu that, for guest stores
in the range [0x822F1AA8, 0x822F1E10), captures the guest PC + the written
value when the store would set bit 28 of any address. This identifies the
exact PC that sets the loop-exit bit.
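
The filter such a hook would apply can be sketched as a predicate over a store trace. This is a model only, not xenia-cpu code: the `Store` record, `sets_bit28`, and `scan` are hypothetical names, the sketch handles 32-bit word stores only, and it reads "would set bit 28" as "the old value had the bit clear and the written value has it set".

```python
from dataclasses import dataclass

PC_LO, PC_HI = 0x822F1AA8, 0x822F1E10  # half-open guest PC range from the text
BIT28 = 0x10000000


@dataclass(frozen=True)
class Store:
    pc: int     # guest PC of the store instruction
    addr: int   # guest address written
    value: int  # 32-bit value written (word stores only in this sketch)


def sets_bit28(store: Store, old_value: int) -> bool:
    """True for an in-range store that flips bit 28 from 0 to 1."""
    in_range = PC_LO <= store.pc < PC_HI
    return in_range and not (old_value & BIT28) and bool(store.value & BIT28)


def scan(stores, memory=None):
    """Replay a store trace and collect (pc, addr, value) for matching stores."""
    memory = {} if memory is None else memory
    hits = []
    for store in stores:
        if sets_bit28(store, memory.get(store.addr, 0)):
            hits.append((store.pc, store.addr, store.value))
        memory[store.addr] = store.value  # track the last value per address
    return hits
```

A filter keyed on the store's own PC only sees store instructions located inside the function body; a store executed in a callee would need the range widened to be caught.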

Alternative: extend `--mem-watch` to also capture kernel-side stores by
hooking the GuestMemory write path at the kernel-state level.

Even simpler: add a one-shot `--bit-watch=ADDR:MASK` cvar that fires when
the value at ADDR has any bit in MASK transition from 0→1, regardless of
who wrote it. This is the cleanest diagnostic for this exact pattern.
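
The core of the proposed `--bit-watch` is a rising-edge test on the watched word. A minimal sketch, assuming a hex `ADDR:MASK` spec and a caller that supplies the old and new word values; `parse_bit_watch`, `rising_bits`, and `BitWatch` are hypothetical names, since the cvar does not exist yet.

```python
from typing import Optional, Tuple


def parse_bit_watch(spec: str) -> Tuple[int, int]:
    """Parse "ADDR:MASK" (both hex, assumed format) into two integers."""
    addr_text, mask_text = spec.split(":", 1)
    return int(addr_text, 16), int(mask_text, 16)


def rising_bits(old: int, new: int, mask: int) -> int:
    """Bits inside `mask` that are 0 in `old` and 1 in `new`."""
    return ~old & new & mask


class BitWatch:
    """One-shot watch: reports the first 0->1 transition inside the mask."""

    def __init__(self, addr: int, mask: int):
        self.addr, self.mask, self.fired = addr, mask, False

    def observe(self, old: int, new: int) -> Optional[str]:
        rising = rising_bits(old, new, self.mask)
        if self.fired or not rising:
            return None
        self.fired = True
        return (f"bit-watch {self.addr:#010x}: {old:#010x} -> {new:#010x} "
                f"(rising {rising:#010x})")


watch = BitWatch(*parse_bit_watch("0x40d09a40:0x10000000"))
```

Because the test compares values rather than intercepting a particular write path, it is indifferent to who performed the store, which is the property the text asks for. Where the old and new values get sampled is left open here.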

## Fix shape (when bit-28 setter is identified)

If the bit-28 setter is inside the vtable[+0] dispatch chain at 0x822F1B4C
(target sub_82173990), then the fix might be a state-init issue in the
kernel/runtime.

If the bit-28 setter is inside the inner wait or one of the kernel calls
(`bl 0x824AA8B0`, `bl 0x824AA330`), the fix might be a missing event signal
or a wrong handle-state evolution.

If we can't identify the setter cleanly, the synthesis fallback is to
**inject a kernel-side hook that clears bit 28 of [r30+0] on every entry to
sub_822F1AA8's bit-check site (0x822F1BB0)**. Crude but should keep the
main thread in the loop.
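
The memory operation that fallback hook would perform is a read-modify-write on the controller's flag word. A minimal sketch, assuming guest memory is exposed as a byte buffer holding big-endian 32-bit words (the guest is big-endian PowerPC); `clear_exit_bit` is a hypothetical name, and the trigger at PC 0x822F1BB0 is not modelled.

```python
import struct

EXIT_MASK = 0x10000000  # loop-exit gate bit in the controller's flag word


def clear_exit_bit(guest_mem: bytearray, offset: int) -> int:
    """Clear bit 28 of the big-endian u32 at `offset`; return the new value."""
    (flags,) = struct.unpack_from(">I", guest_mem, offset)
    cleared = flags & ~EXIT_MASK & 0xFFFFFFFF  # keep every other flag bit
    struct.pack_into(">I", guest_mem, offset, cleared)
    return cleared


# Stand-in for the controller's first word with the gate bit already set.
controller = bytearray(struct.pack(">I", 0x10000023))
print(hex(clear_exit_bit(controller, 0)))  # 0x23
```

The mask-and-write leaves the other flag bits untouched, so the `0x1`, `0x2`, and `0x20` bits seen in the probes would survive the hook.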

## Why this is a clearer wedge picture than rounds 22-33

Rounds 22-33 chased the audit-049 wedge from various angles. The diagnoses
landed on different layers:
- R22: "wrong cluster targeted" (cluster A vs B)
- R26-30: "state-machine progression bug"
- R32-33: "pool 3 starvation; bootstrap walk-back"

This round establishes the simplest possible framing:

> **Canary's main thread loops forever in a dispatcher; ours' main thread
> exits the loop after one setup phase. The exit is gated by a single bit
> on the controller's flag word.**

If bit 28 of `[controller+0]` could be permanently cleared, ours' main
thread would stay in the loop, sub_821741C8 would dispatch, signals would
flow, tid=13 would complete, draws would happen.
@@ -0,0 +1,79 @@
AUDIT-PC-PROBE pc=0x822f1aa8 tid=1 hw=0 cycle=6180796 lr=0x8216ee14 r3=0x40d09a40 r11=0x40111910 [r3+0]=0x00000021 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x40541a40 [r3+0x30]=0x00000000
AUDIT-PC-PROBE pc=0x822f1b38 tid=1 hw=0 cycle=6181181 lr=0x822f1b38 r3=0x00000001 r11=0x824b0000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000

=== Final State ===
PC: 0x824ac578
LR: 0x824ac578
CTR: 0x82153bf0
CR: 0x24000028
XER: CA=0 OV=0 SO=0
r0 : 0x0000000082153bf0
r1 : 0x00000000700ff6e0
r2 : 0x0000000020000000
r4 : 0x0000000000000001
r7 : 0x0000000003a72328
r8 : 0x0000000043b77284
r9 : 0x0000000043b77328
r10: 0x0000000000000001
r11: 0x0000000000000103
r12: 0x0000000082173c64
r13: 0x000000007fff0000
r18: 0x0000000040d09a7c
r23: 0x00000000828f3844
r26: 0x000000004024a4e0
r27: 0x00000000820a17a8
r31: 0x0000000000001070

=== Thread diagnostics ===
hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4208], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x700ff6e0
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a72328
r8=0x43b77284 r9=0x43b77328 r10=0x00000001 r11=0x00000103 r12=0x82173c64 r13=0x7fff0000
hw=0 idx=1 tid=11 state=Blocked(WaitAny { handles: [2190094916, 2190094880], deadline: None }) pc=0x824d2a94 lr=0x824d2a94 sp=0x71497d90
r0=0x00000000 r3=0x00000000 r4=0x71497de0 r5=0x00000001 r6=0x00000003 r7=0x00000001
r8=0x00000000 r9=0x00000000 r10=0x71497df0 r11=0x828a3244 r12=0xbcbcbcbc r13=0x4b9f1000
hw=1 idx=0 tid=2 state=Blocked(WaitAny { handles: [2189887804], deadline: None }) pc=0x824a95f8 lr=0x824a95f8 sp=0x710ffd20
r0=0x0000030c r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x00000000
r8=0x00000001 r9=0x6f000000 r10=0x824a9178 r11=0x82870000 r12=0x824a94f0 r13=0x4acc3000
hw=1 idx=1 tid=13 state=Blocked(WaitAny { handles: [4216], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x715a7a20
r0=0x821511d0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
r8=0x43b77334 r9=0x43b77334 r10=0x40541f80 r11=0x00000001 r12=0x821cb1e0 r13=0x4d1d4000
hw=2 idx=0 tid=7 state=Blocked(WaitAny { handles: [1111821148], deadline: Some(42946672) }) pc=0x824cd4f4 lr=0x824cd4f4 sp=0x71187e60
r0=0x00000000 r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x71187eb0
r8=0x00000000 r9=0x00000000 r10=0x00000002 r11=0x00000002 r12=0xbcbcbcbc r13=0x4b1d6000
hw=2 idx=1 tid=8 state=Blocked(WaitAny { handles: [4176, 4132], deadline: None }) pc=0x824ab214 lr=0x824ab214 sp=0x71287c90
r0=0x00000000 r3=0x00000000 r4=0x71287cf0 r5=0x00000001 r6=0x00000001 r7=0x00000000
r8=0x00000000 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x822f1ff0 r13=0x4b90a000
hw=3 idx=0 tid=4 state=Blocked(WaitAny { handles: [4120], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7112fb80
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
r8=0x43b7732c r9=0x828f0000 r10=0x00000008 r11=0x00000000 r12=0x8245a660 r13=0x4adc6000
hw=3 idx=1 tid=5 state=Blocked(WaitAny { handles: [4224], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7116fbe0
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
r8=0x43b7732c r9=0x828f0000 r10=0x00000001 r11=0x00000000 r12=0x82458b34 r13=0x4adc8000
hw=4 idx=0 tid=9 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71387df0
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
r8=0x4b9ec000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ec000
hw=5 idx=0 tid=3 state=Blocked(WaitAny { handles: [4112], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7111fdf0
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x00000a10
r8=0x00000010 r9=0x00000000 r10=0x00009030 r11=0x00000000 r12=0x82181988 r13=0x4adc4000
hw=5 idx=1 tid=6 state=Ready pc=0x824ab214 lr=0x824ab214 sp=0x7117fc60
r0=0x821511a0 r3=0x00000001 r4=0x7117fcc0 r5=0x00000001 r6=0x00000001 r7=0x00000000
r8=0x7117fcb0 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x82458d68 r13=0x4adca000
hw=5 idx=2 tid=10 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71487e00
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
r8=0x4b9ee000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ee000
hw=5 idx=3 tid=12 state=Ready pc=0x824aa6a4 lr=0x824aa6a4 sp=0x714a7da0
r0=0x00000000 r3=0x000000ff r4=0x00000020 r5=0x714a7df4 r6=0x00000000 r7=0x00000000
r8=0x00000000 r9=0x00000000 r10=0x00000000 r11=0x00000001 r12=0x8217898c r13=0x4d1d2000

-- Handle waiter lists --
handle=0x00001018 Semaphore(0/2147483647) waiters(tid)=[4]
handle=0x8287093c Event(sig=false, mr=false) waiters(tid)=[2]
handle=0x00001070 Thread(id=13, exit=None) waiters(tid)=[1]
handle=0x42450b5c Event(sig=false, mr=true) waiters(tid)=[7]
handle=0x00001078 Event(sig=false, mr=false) waiters(tid)=[13]
handle=0x00001080 Event(sig=false, mr=false) waiters(tid)=[5]
handle=0x828a3244 Event(sig=false, mr=false) waiters(tid)=[11]
handle=0x00001024 Semaphore(0/2147483647) waiters(tid)=[8]
handle=0x828a3220 Event(sig=false, mr=true) waiters(tid)=[11]
handle=0x00001010 Event(sig=false, mr=true) waiters(tid)=[3]
handle=0x00001050 Event(sig=false, mr=true) waiters(tid)=[8]