mixed attractors

Desai, Tej

Intuition Labs · φ research · 25 · measured · 19 may 2026

Our verification gate places any (prompt, asserts) cell in a four-quadrant matrix on two axes — convergence and correctness. But the LOW-convergence rows hide a critical distinction: when multiple distinct outputs share the same overall pass-rate, are they all valid alternatives, or is the substrate flipping silently between a correct answer and a wrong one? Pure hash-counting can't tell. We added per-attractor correctness checking — for each unique output, did ALL its trials pass asserts? — and split the LOW-convergence row into bounded-valid and mixed-attractors. The mixed-attractors verdict IS the silent flip-flop alert the synthesis-thread asked for.

4-quadrant matrix → 6-verdict grid · silent flip-flop detection

verdict = quadrant(convergence, correctness, k_all_pass / k_attractors) · per-attractor correctness disambiguates bounded-valid (all attractors pass) from silent-flip (some attractors fail)

the takeaway in one paragraph

In our verification gate, when convergence is low and correctness is high, the naive verdict is 'bounded valid attractors' — the model is finding multiple correct algorithms for the same prompt. But that interpretation can be wrong. The model might be flipping between a CORRECT algorithm (most trials, the modal cluster) and an INCORRECT algorithm (a few trials, looks like noise). If the wrong attractor only fires occasionally, the aggregate pass-rate stays high, the convergence stays low, and the gate falsely reports bounded-valid. We close this gap by checking per-attractor correctness: for each distinct output, ALL its trials must pass asserts to count as 'valid.' Any attractor where some trials fail is flagged. The new verdict — mixed attractors — is the silent flip-flop alert that's load-bearing for any system that depends on the gate.

the gap in the 4-quadrant matrix

Our prior verification gate (covered in note 24) used two axes and four quadrants. The LOW/HIGH quadrant (low convergence, high correctness) carried the verdict 'bounded valid attractors' — meaning multiple correct implementations exist. Fine when it's true. But the verdict was inferred from aggregate stats: distinct-hash count and total pass-rate. Neither directly checks whether each distinct shape is correct.

Worked example. Suppose fifteen trials produce two distinct outputs. Output A appears ten times and passes every time. Output B appears five times and fails three out of five times. Aggregate correctness = (10 + 2) / 15 = 80% — just above the HIGH threshold. Convergence = 10/15 = 67% — LOW. Naive verdict: LOW/HIGH bounded-valid. But output B is actually WRONG much of the time. The gate said 'multiple valid implementations'; reality is 'one valid implementation plus one silent flip-flop.'
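The arithmetic in this example can be reproduced in a few lines. The sketch below is illustrative only: the trial log is reconstructed from the counts in the paragraph above, the labels "A" and "B" stand in for output hashes, and convergence is taken to be the modal output's share of trials, which matches the 10/15 figure.

```python
from collections import Counter

# Hypothetical trial log rebuilt from the worked example:
# each entry is (output_label, passed_all_asserts).
trials = [("A", True)] * 10 + [("B", True)] * 2 + [("B", False)] * 3

counts = Counter(label for label, _ in trials)
passes = Counter(label for label, ok in trials if ok)

# Aggregate view: what the 4-quadrant matrix sees.
aggregate_correctness = sum(ok for _, ok in trials) / len(trials)
convergence = counts.most_common(1)[0][1] / len(trials)  # modal output's share

# Per-output view: what the aggregate hides.
per_output_pass_rate = {label: passes[label] / n for label, n in counts.items()}

print(f"aggregate correctness: {aggregate_correctness:.2f}")  # 0.80
print(f"convergence:           {convergence:.2f}")            # 0.67
print(per_output_pass_rate)                                   # {'A': 1.0, 'B': 0.4}
```

The aggregate numbers look acceptable, while the per-output breakdown shows that output B passes only two of its five trials.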

This is the case that motivated the gate in the first place — the failure mode where the substrate looks reliable in aggregate but harbors a confidently-wrong attractor. Detection from hash-counting alone misses it.

per-attractor correctness

The fix is one extra calculation per cell. For each distinct output (hash), count how many trials produced it AND how many of those trials passed all asserts. If those two numbers are equal — every trial that produced this output passed — the attractor is 'all-pass.' If they differ, the attractor includes at least one failure.

Then aggregate one more number: of the K distinct attractors in the cell, how many are all-pass? Call it K-all-pass. The new verdict logic on the LOW convergence row uses this directly: if K-all-pass equals K, all attractors are reliably correct → bounded-valid; if K-all-pass is less than K, at least one attractor includes failures → mixed-attractors / silent flip-flop.
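A minimal sketch of that calculation, assuming each trial is recorded as an (output text, passed-all-asserts) pair and that outputs are hashed with SHA-256. The names `analyze_attractors` and `AttractorStats` are illustrative, not taken from the gate's code.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class AttractorStats:
    trials: int = 0  # how many trials produced this output
    passed: int = 0  # how many of those trials passed all asserts

    @property
    def all_pass(self) -> bool:
        return self.trials == self.passed


def analyze_attractors(trials):
    """trials: iterable of (output_text, passed_all_asserts) pairs.

    Returns (stats_by_hash, k, k_all_pass).
    """
    stats = {}
    for output, passed in trials:
        # Assumption: an attractor is identified by the hash of the raw output.
        key = hashlib.sha256(output.encode("utf-8")).hexdigest()
        entry = stats.setdefault(key, AttractorStats())
        entry.trials += 1
        entry.passed += int(passed)
    k = len(stats)
    k_all_pass = sum(s.all_pass for s in stats.values())
    return stats, k, k_all_pass


# Usage with made-up outputs: one attractor always passes, one sometimes fails.
example = [("output A", True)] * 10 + [("output B", True)] * 2 + [("output B", False)] * 3
stats, k, k_all_pass = analyze_attractors(example)
print(k, k_all_pass)  # 2 1
```

Because `k_all_pass` is less than `k` here, the cell would land on the mixed-attractors side of the LOW convergence row.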

the 6-verdict grid

convergence	correctness	attractors	verdict
HIGH	HIGH	single dominant	SHIP · clean
HIGH	HIGH	multiple all-pass	SHIP · cosmetic variance
HIGH	LOW	—	TRAIN · confidently wrong
LOW	HIGH	K all-pass = K total	bounded-valid · accept diversity
LOW	HIGH	K all-pass < K total	MIXED ATTRACTORS · silent flip-flop
LOW	LOW	—	confused domain

The four quadrants of the prior matrix split into six verdicts: HIGH/HIGH and LOW/HIGH each divide in two, while HIGH/LOW and LOW/LOW are unchanged. The cosmetic-variance verdict (HIGH/HIGH, multiple all-pass attractors) and the mixed-attractors verdict (LOW/HIGH, at least one attractor with failures) are the new entries. The MIXED ATTRACTORS verdict carries the silent-flip-flop alert and the TRAIN prescription.
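The grid can be read as a small decision function. The sketch below is one possible encoding, not the gate's actual code: the threshold values are assumptions (the note does not state them; they are chosen so that 0.70 convergence reads as LOW and 0.80 correctness reads as HIGH, as in the examples), "single dominant" is interpreted as exactly one distinct output, and the HIGH/HIGH case with a failing attractor is not listed in the grid, so the sketch returns an explicit placeholder for it.

```python
# Assumed thresholds; the note does not give the real values.
CONVERGENCE_HIGH = 0.80
CORRECTNESS_HIGH = 0.75


def verdict(convergence: float, correctness: float, k: int, k_all_pass: int) -> str:
    """Map a cell's statistics onto the 6-verdict grid.

    k is the number of distinct attractors, k_all_pass the number whose
    trials all passed asserts.
    """
    high_conv = convergence >= CONVERGENCE_HIGH
    high_corr = correctness >= CORRECTNESS_HIGH

    if high_conv and high_corr:
        if k == 1:
            return "SHIP · clean"
        if k_all_pass == k:
            return "SHIP · cosmetic variance"
        # Not covered by the grid; flagged rather than guessed.
        return "UNLISTED · HIGH/HIGH with a failing attractor"
    if high_conv:
        return "TRAIN · confidently wrong"
    if high_corr:
        if k_all_pass == k:
            return "bounded-valid · accept diversity"
        return "MIXED ATTRACTORS · silent flip-flop"
    return "confused domain"


# The two-output example from earlier in the note: 67% convergence, 80% correctness.
print(verdict(10 / 15, 12 / 15, k=2, k_all_pass=1))
# The Fibonacci cell: convergence 0.70, correctness 1.0, four all-pass attractors.
print(verdict(0.70, 1.0, k=4, k_all_pass=4))
```

Under these assumed thresholds the first call yields the mixed-attractors verdict and the second yields bounded-valid.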

the worked example

We ran the new gate on a Fibonacci-function prompt at N=10 trials, single-pass mode. The result: convergence 0.70, correctness 1.0, four distinct attractors, all four 'all-pass.' Verdict: bounded-valid. The four shapes are minor variants — one canonical 11-line implementation with explicit base cases, one variant using else: blocks instead of elif, one with slightly different loop bounds, one with rearranged returns. All four are algorithmically correct. The substrate has found four cosmetic/structural variants of the same correct algorithm. No training needed; the diversity is acceptable.

The same prompt run repeatedly would normally show three or four such attractors. If one of them were wrong — say the model occasionally produced an off-by-one Fibonacci — the gate would now catch it: K-all-pass would be three out of four, and the verdict would flip to MIXED ATTRACTORS. The operator gets an explicit alert that the substrate has a correct AND a wrong load-bearing prior on this prompt class. That's exactly the case downstream systems can't tolerate, and exactly the case that aggregate stats hide.
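To make the hypothetical concrete, here is a small simulation of a cell with three correct Fibonacci variants and one off-by-one variant. Everything in it is an assumption for illustration: the four source strings are hand-written stand-ins rather than outputs captured from the model, the 7/1/1/1 split is invented, and the asserts are a guess at what a Fibonacci check might look like.

```python
import hashlib
from collections import Counter

# Hand-written stand-ins for sampled outputs (not real model outputs).
CANONICAL = '''def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
'''
LOOP_BOUNDS = '''def fib(n):
    if n < 2:
        return n
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
'''
NESTED_ELSE = '''def fib(n):
    if n == 0:
        return 0
    else:
        if n == 1:
            return 1
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
'''
OFF_BY_ONE = '''def fib(n):
    a, b = 0, 1
    for _ in range(n + 1):  # one iteration too many
        a, b = b, a + b
    return a
'''


def passes_asserts(source: str) -> bool:
    """Execute the candidate source and check the first eight Fibonacci numbers."""
    namespace = {}
    try:
        exec(source, namespace)
        fib = namespace["fib"]
        assert [fib(i) for i in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
    except Exception:
        return False
    return True


# Invented split: the canonical shape dominates, the wrong one fires once.
sampled = [CANONICAL] * 7 + [LOOP_BOUNDS, NESTED_ELSE, OFF_BY_ONE]

trials_per_hash = Counter()
passes_per_hash = Counter()
for source in sampled:
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    trials_per_hash[key] += 1
    passes_per_hash[key] += passes_asserts(source)

k = len(trials_per_hash)
k_all_pass = sum(passes_per_hash[h] == n for h, n in trials_per_hash.items())
convergence = max(trials_per_hash.values()) / len(sampled)
correctness = sum(passes_per_hash.values()) / len(sampled)

print(f"convergence={convergence:.2f} correctness={correctness:.2f} "
      f"k={k} k_all_pass={k_all_pass}")
# convergence=0.70 correctness=0.90 k=4 k_all_pass=3
```

Aggregate correctness stays at 0.90, but `k_all_pass` is three out of four, which is the condition that flips the verdict to MIXED ATTRACTORS.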

why this is the load-bearing distinction

Three of the six verdicts say 'no training needed.' Three say 'train.' Of the three TRAIN verdicts, the silent-flip-flop one is the most important to catch because it's the only one that LOOKS LIKE NOT-TRAIN if you only have the aggregate stats. The other two TRAIN verdicts have low aggregate correctness as the smoke signal; this one doesn't.

From a verification-gate perspective: aggregate correctness is necessary but not sufficient. Aggregate convergence is necessary but not sufficient. Per-attractor correctness is the additional check that closes the gap between them. Without it, the gate's LOW/HIGH row gives false-positive 'bounded-valid' verdicts on domains where some attractors are actually wrong.

We added this to our gate as one extra dictionary in the analysis function and a single new conditional in the verdict logic. Maybe fifty lines of code on top of what was already there. Operationally trivial. Conceptually load-bearing.

what we learned

lesson	why it matters
Aggregate stats can hide the dangerous case	High overall pass-rate plus low convergence can be bounded-valid OR silent-flip. The aggregate numbers don't distinguish.
Per-attractor correctness is the missing check	Verifying that each distinct output is internally consistent (all trials with that output pass) catches what aggregates miss.
The silent-flip-flop verdict has its own prescription	TRAIN, but specifically train against the failing attractor while preserving the passing ones. Don't collapse the model into a single attractor when the other attractors are already correct.
The verification gate matures by splitting verdicts	The 4-quadrant matrix was correct as far as it went; the 6-verdict grid extends it to cases the matrix could not separate. Each refinement is one new edge case the gate now handles.
