mixed attractors
Our verification gate places any (prompt, asserts) cell in a four-quadrant matrix on two axes — convergence and correctness. But the LOW-convergence rows hide a critical distinction: when multiple distinct outputs share the same overall pass-rate, are they all valid alternatives, or is the substrate flipping silently between a correct answer and a wrong one? Pure hash-counting can't tell. We added per-attractor correctness checking — for each unique output, did ALL its trials pass asserts? — and split the LOW-convergence row into bounded-valid and mixed-attractors. The mixed-attractors verdict IS the silent flip-flop alert the synthesis-thread asked for.
the takeaway in one paragraph
In our verification gate, when convergence is low and correctness is high, the naive verdict is 'bounded valid attractors' — the model is finding multiple correct algorithms for the same prompt. But that interpretation can be wrong. The model might be flipping between a CORRECT algorithm (most trials, the modal cluster) and an INCORRECT algorithm (a few trials, looks like noise). If the wrong attractor only fires occasionally, the aggregate pass-rate stays high, the convergence stays low, and the gate falsely reports bounded-valid. We close this gap by checking per-attractor correctness: for each distinct output, ALL its trials must pass asserts to count as 'valid.' Any attractor where some trials fail is flagged. The new verdict — mixed attractors — is the silent flip-flop alert that's load-bearing for any system that depends on the gate.
the gap in the 4-quadrant matrix
Our prior verification gate (covered in note 24) used two axes and four quadrants. The LOW/HIGH quadrant (low convergence, high correctness) carried the verdict 'bounded valid attractors' — meaning multiple correct implementations exist. Fine when it's true. But the verdict was inferred from aggregate stats: distinct-hash count and total pass-rate. Neither directly checks whether each distinct shape is correct.
Worked example. Suppose fifteen trials produce two distinct outputs. Output A appears ten times and passes every time. Output B appears five times and fails three out of five times. Aggregate correctness = (10 + 2) / 15 = 80% — just above the HIGH threshold. Convergence = 10/15 = 67% — LOW. Naive verdict: LOW/HIGH bounded-valid. But output B is actually WRONG much of the time. The gate said 'multiple valid implementations'; reality is 'one valid implementation plus one silent flip-flop.'
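The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch: the `trials` list and the labels "A" and "B" are hypothetical stand-ins for hashed outputs, not data from our gate.

```python
from collections import Counter

# Hypothetical trial records for the example above: (output_label, passed_asserts).
trials = [("A", True)] * 10 + [("B", True)] * 2 + [("B", False)] * 3

counts = Counter(label for label, _ in trials)
modal_count = counts.most_common(1)[0][1]

# Convergence: share of trials that landed in the most common output.
convergence = modal_count / len(trials)
# Aggregate correctness: share of all trials that passed asserts.
correctness = sum(ok for _, ok in trials) / len(trials)
# Output B's own pass-rate, which the aggregate number hides.
b_pass_rate = sum(ok for label, ok in trials if label == "B") / counts["B"]

print(f"convergence={convergence:.2f}")
print(f"correctness={correctness:.2f}")
print(f"B pass-rate={b_pass_rate:.2f}")
```

The aggregate correctness of 0.80 looks healthy, while output B on its own passes only 40% of the time.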
This is the case that motivated the gate in the first place — the failure mode where the substrate looks reliable in aggregate but harbors a confidently-wrong attractor. Detection from hash-counting alone misses it.
per-attractor correctness
The fix is one extra calculation per cell. For each distinct output (hash), count how many trials produced it AND how many of those trials passed all asserts. If those two numbers are equal — every trial that produced this output passed — the attractor is 'all-pass.' If they differ, the attractor includes at least one failure.
Then aggregate one more number: of the K distinct attractors in the cell, how many are all-pass? Call it K-all-pass. The new verdict logic on the LOW convergence row uses this directly: if K-all-pass equals K, all attractors are reliably correct → bounded-valid; if K-all-pass is less than K, at least one attractor includes failures → mixed-attractors / silent flip-flop.
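The two steps above (per-attractor counts, then K-all-pass) might look roughly like this. It is a sketch under stated assumptions, not our actual analysis function: the names `per_attractor_stats` and `low_convergence_verdict`, the use of SHA-256 as the output hash, and the `(output_text, passed)` trial shape are all illustrative.

```python
import hashlib
from collections import defaultdict

def per_attractor_stats(trials):
    """Group trials by output hash.

    trials: iterable of (output_text, passed_all_asserts) pairs (assumed shape).
    Returns {hash: {"count": trials with this output, "passed": how many passed}}.
    """
    stats = defaultdict(lambda: {"count": 0, "passed": 0})
    for output, passed in trials:
        key = hashlib.sha256(output.encode("utf-8")).hexdigest()
        stats[key]["count"] += 1
        stats[key]["passed"] += int(passed)
    return dict(stats)

def low_convergence_verdict(trials):
    """Verdict for the LOW-convergence, HIGH-correctness row only."""
    stats = per_attractor_stats(trials)
    k = len(stats)
    # An attractor is all-pass when every trial that produced it passed.
    k_all_pass = sum(1 for s in stats.values() if s["passed"] == s["count"])
    verdict = "bounded-valid" if k_all_pass == k else "mixed-attractors"
    return verdict, k_all_pass, k

# Usage: one all-pass output plus one output that fails on some trials.
example = [("A", True)] * 10 + [("B", True)] * 2 + [("B", False)] * 3
print(low_convergence_verdict(example))
```

Keeping the per-hash dictionary around, rather than only the final verdict, also lets the operator see which specific attractor is the failing one.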
the 6-verdict grid
| convergence | correctness | attractors | verdict |
|---|---|---|---|
| HIGH | HIGH | single dominant | SHIP · clean |
| HIGH | HIGH | multiple all-pass | SHIP · cosmetic variance |
| HIGH | LOW | — | TRAIN · confidently wrong |
| LOW | HIGH | K all-pass = K total | bounded-valid · accept diversity |
| LOW | HIGH | K all-pass < K total | MIXED ATTRACTORS · silent flip-flop |
| LOW | LOW | — | confused domain |
The four quadrants of the prior matrix split cleanly into six verdicts. The cosmetic-variance verdict (the second HIGH/HIGH row, multiple all-pass attractors) and the mixed-attractors verdict (the second LOW/HIGH row, where K-all-pass is less than K) are the new entries. The MIXED ATTRACTORS verdict carries the silent-flip-flop alert and the TRAIN prescription.
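As a rough sketch, the grid maps onto a single function. Several things here are assumptions rather than our gate's real values: the function name `gate_verdict`, the two threshold defaults (placeholders chosen only so the examples in this note land in the rows described), and reading "single dominant" as exactly one attractor. The grid does not say what happens in a HIGH/HIGH cell that contains a failing attractor, so the sketch returns an explicit marker for that case instead of guessing.

```python
def gate_verdict(convergence, correctness, k, k_all_pass,
                 conv_threshold=0.8, corr_threshold=0.75):
    """Map one cell to a verdict from the six-row grid.

    conv_threshold and corr_threshold are placeholder values (assumptions).
    k is the number of distinct attractors, k_all_pass the number whose
    trials all passed asserts.
    """
    high_conv = convergence >= conv_threshold
    high_corr = correctness >= corr_threshold

    if high_conv and high_corr:
        if k_all_pass < k:
            # Not covered by the grid; flagged instead of assigned a verdict.
            return "UNSPECIFIED · failing attractor in HIGH/HIGH cell"
        return "SHIP · clean" if k == 1 else "SHIP · cosmetic variance"
    if high_conv:
        return "TRAIN · confidently wrong"
    if high_corr:
        if k_all_pass == k:
            return "bounded-valid · accept diversity"
        return "MIXED ATTRACTORS · silent flip-flop"
    return "confused domain"

# Usage: the 15-trial flip-flop case, then the Fibonacci run reported in this note.
print(gate_verdict(10 / 15, 12 / 15, k=2, k_all_pass=1))
print(gate_verdict(0.70, 1.0, k=4, k_all_pass=4))
```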
the worked example
We ran the new gate on a Fibonacci-function prompt at N=10 trials, single-pass mode. The result: convergence 0.70, correctness 1.0, four distinct attractors, all four 'all-pass.' Verdict: bounded-valid. The four shapes are minor variants — one canonical 11-line implementation with explicit base cases, one variant using else: blocks instead of elif, one with slightly different loop bounds, one with rearranged returns. All four are algorithmically correct. The substrate has found four cosmetic/structural variants of the same correct algorithm. No training needed; the diversity is acceptable.
The same prompt run repeatedly would normally show three or four such attractors. If one of them were wrong — say the model occasionally produced an off-by-one Fibonacci — the gate would now catch it: K-all-pass would be three out of four, and the verdict would flip to MIXED ATTRACTORS. The operator gets an explicit alert that the substrate has a correct AND a wrong load-bearing prior on this prompt class. That's exactly the case downstream systems can't tolerate, and exactly the case that aggregate stats hide.
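To make the hypothetical concrete, here is a toy reproduction of the off-by-one case. Everything in it is illustrative: the three source strings stand in for model outputs, the 6/3/1 split across ten trials is invented, the asserts assume the convention fib(0) = 0 and fib(1) = 1, and it uses three attractors rather than four to stay short.

```python
import hashlib

# Two correct variants and one off-by-one variant (hypothetical model outputs).
CANONICAL = """
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
"""

BASE_CASE_VARIANT = """
def fib(n):
    if n < 2:
        return n
    a, b = 0, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b
"""

OFF_BY_ONE = """
def fib(n):
    a, b = 0, 1
    for _ in range(n + 1):  # one iteration too many
        a, b = b, a + b
    return a
"""

def passes_asserts(source):
    """Run the assumed asserts against one generated implementation."""
    namespace = {}
    exec(source, namespace)
    fib = namespace["fib"]
    return fib(0) == 0 and fib(1) == 1 and fib(10) == 55

outputs = [CANONICAL] * 6 + [BASE_CASE_VARIANT] * 3 + [OFF_BY_ONE]

per_attractor = {}  # hash -> (trial count, pass count)
for src in outputs:
    key = hashlib.sha256(src.encode("utf-8")).hexdigest()[:8]
    count, passed = per_attractor.get(key, (0, 0))
    per_attractor[key] = (count + 1, passed + int(passes_asserts(src)))

k = len(per_attractor)
k_all_pass = sum(1 for count, passed in per_attractor.values() if count == passed)
aggregate = sum(passed for _, passed in per_attractor.values()) / len(outputs)

print(f"aggregate correctness={aggregate:.2f}, K={k}, K-all-pass={k_all_pass}")
```

The aggregate pass-rate is 0.90, yet K-all-pass is two out of three, which is the signal that flips the verdict to MIXED ATTRACTORS.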
why this is the load-bearing distinction
Three of the six verdicts say 'no training needed.' Three say 'train.' Of the three TRAIN verdicts, the silent-flip-flop one is the most important to catch because it's the only one that LOOKS LIKE NOT-TRAIN if you only have the aggregate stats. The other two TRAIN verdicts have low aggregate correctness as the smoke signal; this one doesn't.
From a verification-gate perspective: aggregate correctness is necessary but not sufficient. Aggregate convergence is necessary but not sufficient. Per-attractor correctness is the additional check that closes the gap between them. Without it, the gate's LOW/HIGH row gives false-positive 'bounded-valid' verdicts on domains where some attractors are actually wrong.
We added this to our gate as one extra dictionary in the analysis function and a single new conditional in the verdict logic. Maybe fifty lines of code on top of what was already there. Operationally trivial. Conceptually load-bearing.
what we learned
| lesson | why it matters |
|---|---|
| Aggregate stats can hide the dangerous case | High overall pass-rate plus low convergence can be bounded-valid OR silent-flip. The aggregate numbers don't distinguish. |
| Per-attractor correctness is the missing check | Verifying that each distinct output is internally consistent (all trials with that output pass) catches what aggregates miss. |
| The silent-flip-flop verdict has its own prescription | TRAIN, but specifically against the failing attractor while preserving the passing ones. Don't collapse the model into a single attractor when the other attractors are already correct. |
| The verification gate matures by splitting verdicts | The 4-quadrant matrix was correct as far as it went; the 6-verdict grid extends it to cases the matrix could not separate. Each refinement is one new edge case the gate now handles. |