techs are options
wavedash, MTGO infinite combo, the right tennis grip: "tech" is the gaming word for what reinforcement learning calls an option. the concept landed. the proposed self-discovery extension hit its own pre-registered falsifier.
the problem
A coding agent that picks the right model and the right prompt is doing strategy. Strategy is the play — what you want the system to do. It is necessary and not sufficient.
Paraphrased from a user message this week, on MTGO: in paper Magic an infinite combo wins by you showing the cards and your opponent scooping. Online, the engine does not care about strategy. If you do not know to hold CTRL to keep priority and auto-yield to your own triggers, you time out and lose. The model knows the winning play. The tech is the specific click sequence that forces the engine to execute it.
Same split shows up everywhere: Overwatch wavedash, Smash short-hop, the right tennis grip for your body, Magnus playing ten games in his head. Strategy is what to do. The tech is how to make this substrate let you do it.
the word
Reinforcement learning has a name for this object. Sutton, Precup, and Singh (1999) call it an option: an initiation set (when can it fire), an internal policy (what it does once it fires), and a termination condition (when it stops). Invoked as a single primitive in a hierarchy.
That is exactly what a wavedash is. Exactly what an MTGO click sequence is. Exactly what a tennis grip is. The gaming word "tech" and the AI word "option" point at the same object — temporally extended, composable, learnable separately from the surrounding strategy.
The codebox keeps "tech" because the gaming heritage reads cleaner to operators. The formal mapping appears once: a tech is an option. Then we go back to the gaming word.
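The option triple can be written down as a small data structure. This is a minimal sketch, not the codebox's own type: the names `Option`, `can_start`, `should_stop`, and `run_option` are assumptions made for illustration, and the termination condition is simplified to a deterministic predicate where the 1999 paper uses a termination probability.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

S = TypeVar("S")  # state type
A = TypeVar("A")  # action type


@dataclass(frozen=True)
class Option(Generic[S, A]):
    """An option: initiation set, internal policy, termination condition."""
    name: str
    can_start: Callable[[S], bool]    # initiation set, as a predicate on states
    policy: Callable[[S], A]          # internal policy: state -> action
    should_stop: Callable[[S], bool]  # termination condition (deterministic here)


def run_option(option, state, step, max_steps=100):
    """Run the option as one primitive; return the final state and the actions taken."""
    if not option.can_start(state):
        raise ValueError(f"{option.name} cannot start in state {state!r}")
    actions = []
    for _ in range(max_steps):
        if option.should_stop(state):
            break
        action = option.policy(state)
        actions.append(action)
        state = step(state, action)
    return state, actions


# Toy substrate (hypothetical): the state is the number of combo loops left to click through.
hold_priority = Option(
    name="hold_priority_loop",
    can_start=lambda loops: loops > 0,
    policy=lambda loops: "ctrl+click",
    should_stop=lambda loops: loops == 0,
)

final_state, actions = run_option(hold_priority, 3, lambda loops, _action: loops - 1)
```

The caller sees one invocation; the three clicks stay inside the option. That is the sense in which a tech is invoked as a single primitive.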
state compression
Magnus plays ten games in his head not because he stores more individual pieces, but because he holds whole positions as patterns. Chase and Simon (1973) showed this with chess players: masters reconstructed real positions far better than novices, then lost most of that advantage on randomized boards. The expert is not storing more pieces; the expert is storing patterned clusters.
A tech library is the same thing. Each tech is a learned pattern that, when matched against the current request, compresses a long chain of micro-decisions into one invocation. The codebox stores hundreds of these and queries them in one parallel pass at decision time.
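A toy version of that match step, assuming each tech carries a set of required features and a request is a set of observed features. `Tech`, `pattern`, `steps`, and `match_all` are hypothetical names, the listed steps are illustrative, and the "parallel pass" is modelled as a single sweep over the library rather than real concurrency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tech:
    name: str
    pattern: frozenset  # features that must all be present in the request
    steps: tuple        # the micro-decisions this tech stands in for


def match_all(library, request_features):
    """Check every tech against the request in one pass; return those whose pattern matches."""
    return [tech for tech in library if tech.pattern <= request_features]


# Illustrative library entries, not taken from the real codebox.
library = [
    Tech("keep_priority", frozenset({"mtgo", "infinite_combo"}),
         ("hold CTRL", "auto-yield to own triggers", "repeat the loop")),
    Tech("short_hop", frozenset({"smash", "aerial_approach"}),
         ("tap jump", "release early", "attack")),
]

request = frozenset({"mtgo", "infinite_combo", "low_clock"})
matched = match_all(library, request)
```

One matched tech replaces three separate decisions. That replacement is the compression.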
two-frame storage
Every tech in the library ships two files, plus a record of local evidence:
techs/<name>/
├── tech.yaml       ← the option as category. The play. Travels.
├── runtime.yaml    ← the option as executed here. The click sequence
│                     for this engine, this proxy, this model.
└── edges_proven    ← the local evidence: which prompts, what changed.
The MTGO strategy/tech split lives in this file pair. tech.yaml carries the play; runtime.yaml carries the substrate-specific mechanic that forces execution. They are the same option in two frames; the proxy applies the coordinate transformation between them at the moment of action.
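The translation between the two frames can be sketched as a lookup. The real schemas of tech.yaml and runtime.yaml are not shown in this note, so every field name below (`abstract_steps`, `bindings`, `substrate`) is an assumption, and the two dicts stand in for the parsed files.

```python
# Hypothetical contents of tech.yaml: the play, independent of substrate.
tech_frame = {
    "name": "keep_priority",
    "intent": "execute an infinite combo without timing out",
    "abstract_steps": ["retain_priority", "yield_own_triggers"],
}

# Hypothetical contents of runtime.yaml: how each abstract step is executed here.
runtime_frame = {
    "substrate": "mtgo",
    "bindings": {
        "retain_priority": "hold CTRL while clicking",
        "yield_own_triggers": "set auto-yield on own triggers",
    },
}


def to_runtime(tech, runtime):
    """Translate the play into substrate actions; fail loudly on an unbound step."""
    missing = [s for s in tech["abstract_steps"] if s not in runtime["bindings"]]
    if missing:
        raise KeyError(f"{tech['name']} has no {runtime['substrate']} binding for: {missing}")
    return [runtime["bindings"][s] for s in tech["abstract_steps"]]


actions = to_runtime(tech_frame, runtime_frame)
```

Under this reading, moving a tech to a new engine means writing a new runtime frame and leaving the tech frame alone.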
measurable confidence
Which tech fires depends on the prompt. Some techs fire confidently; some fire with measurable hesitation. The phi signal (the same one that halts in note 01 and rates in note 02) gives a per-call confidence band, and the point cloud is indexed against it.
In practice the codebox does not just have a library of techs; it has a library of techs, each carrying a distribution over confidence by prompt class. The right tech for this player, this game, this moment is a query against that cloud, not a guess. The error bars are measured, not assumed.
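One way to turn "a query against that cloud" into code is to pick the tech with the highest lower confidence bound for the prompt class. This is a sketch under stated assumptions: the numbers are made up, plain floats stand in for the phi signal, and the band is a crude normal approximation that is only rough at this sample size.

```python
from statistics import mean, stdev

# Hypothetical observations: per-call confidence, keyed by (tech, prompt class).
cloud = {
    ("keep_priority", "combo"): [0.91, 0.88, 0.93, 0.90],
    ("manual_clicks", "combo"): [0.95, 0.40, 0.85, 0.60],
}


def band(samples, z=1.96):
    """Mean and half-width of an approximate 95% interval for the mean."""
    half_width = z * stdev(samples) / len(samples) ** 0.5
    return mean(samples), half_width


def pick_tech(cloud, prompt_class):
    """Choose the tech whose lower confidence bound is highest for this class."""
    best_name, best_lower = None, float("-inf")
    for (name, cls), samples in cloud.items():
        if cls != prompt_class or len(samples) < 2:
            continue  # stdev needs at least two samples
        centre, half_width = band(samples)
        if centre - half_width > best_lower:
            best_name, best_lower = name, centre - half_width
    return best_name, best_lower


choice, lower_bound = pick_tech(cloud, "combo")
```

Ranking by the lower bound rather than the mean penalizes a tech that sometimes fires at 0.95 and sometimes at 0.40, which is the hesitation the band is there to expose.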
the open question
The framework today can apply techs, compose them, measure them, and commit them. It cannot yet *discover* them. A human reads the point cloud, hypothesizes a candidate, ships it inert, benches it. The discovery step is hand-cranked.
Recursive self-improvement closes only when the box runs the discovery step itself. The next planned build is a tech-mining verb that clusters the point cloud, surfaces spots where a one-rule-different neighbor outperforms the local best, and writes a candidate tech automatically. Pre-registered falsifier: if two iterations produce zero useful new techs, the box stays human-in-the-loop, and we say so.
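The core comparison in that verb can be sketched as follows. This is not the algorithm that was built. It assumes clusters are already given, represents each configuration as a tuple of rule values, and treats the "local best" as a named incumbent per cluster. All names and scores are hypothetical.

```python
from statistics import mean


def differs_by_one(a, b):
    """True when two equal-length rule tuples differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def mine_candidates(cloud, incumbents, margin=0.05):
    """Return (cluster, incumbent, neighbor, gain) wherever a one-rule-different
    neighbor beats the incumbent's mean score by more than `margin`."""
    candidates = []
    for cluster, incumbent in incumbents.items():
        observed = cloud[cluster]
        base = mean(observed[incumbent])
        for config, scores in observed.items():
            if differs_by_one(config, incumbent):
                gain = mean(scores) - base
                if gain > margin:
                    candidates.append((cluster, incumbent, config, gain))
    return sorted(candidates, key=lambda c: c[3], reverse=True)


# Hypothetical cloud: cluster -> rule tuple -> observed scores.
cloud = {
    "refactor": {
        ("terse", "no_tests"): [0.60, 0.62],
        ("terse", "with_tests"): [0.75, 0.77],    # one rule different, better
        ("verbose", "with_tests"): [0.90, 0.92],  # two rules different, ignored
    },
}
incumbents = {"refactor": ("terse", "no_tests")}

found = mine_candidates(cloud, incumbents)
```

A surfaced neighbor is only a candidate. It still has to ship inert and pass a bench before it counts as a tech.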
what landed, what didn't
The concept landed. Every tech in the library is now a typed object — the codebox-side type system carries the option triple directly (precondition, dispatch policy, falsifiers), and the agent's identity is a typed instantiation that lists which techs are part of it and rejects invalid combinations at construction. The MTGO / Overwatch / tennis-grip metaphor maps cleanly onto an algebra over those typed objects. The framing held up.
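What "rejects invalid combinations at construction" can look like, as a sketch. The note does not say what makes a combination invalid, so the `conflicts_with` field is an assumption; `TypedTech` and `Agent` are hypothetical names, and the field types are simplified to strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedTech:
    name: str
    precondition: str     # when the tech may fire
    dispatch_policy: str  # what it does once it fires
    falsifiers: tuple     # observations that would retire it
    conflicts_with: frozenset = frozenset()  # hypothetical exclusion rule


@dataclass(frozen=True)
class Agent:
    name: str
    techs: tuple

    def __post_init__(self):
        # Validation runs during construction, so an invalid agent never exists.
        names = [t.name for t in self.techs]
        if len(names) != len(set(names)):
            raise ValueError(f"{self.name}: duplicate tech in {names}")
        for tech in self.techs:
            clash = tech.conflicts_with & set(names)
            if clash:
                raise ValueError(f"{self.name}: {tech.name} conflicts with {sorted(clash)}")


fast = TypedTech("fast_path", "small diff", "skip full bench", ("regression seen",),
                 conflicts_with=frozenset({"full_bench"}))
full = TypedTech("full_bench", "any diff", "run every bench", ("timeout",))

agent = Agent("reviewer", (fast,))  # valid
# Agent("reviewer", (fast, full))   # raises ValueError: conflicting techs
```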
The discovery primitive did not. The plan was to read the cloud of past requests, surface places where a one-rule-different neighbor outperformed the local best, and emit a candidate tech automatically. We ran two versions of that algorithm. Both produced candidates; both candidates failed their paired benches. Per the pre-registered rule stated above — if two iterations produce zero useful new techs, the box stays human-in-the-loop, and we say so — the discovery primitive does not work as designed. Adding new techs continues to be a human-authored step.
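The decision rule is small enough to write out. In this sketch a candidate counts as useful when its mean paired difference over the baseline is positive. That criterion is an assumption, since the actual bench is not described here, and the outcome label "keep-mining" is invented. The scores are placeholders, not the results of the two runs.

```python
def paired_bench(baseline_scores, candidate_scores, min_gain=0.0):
    """Candidate passes when its mean paired difference over the baseline exceeds min_gain."""
    if len(baseline_scores) != len(candidate_scores) or not baseline_scores:
        raise ValueError("paired bench needs equal-length, non-empty score lists")
    diffs = [c - b for b, c in zip(baseline_scores, candidate_scores)]
    return sum(diffs) / len(diffs) > min_gain


def discovery_verdict(iterations, required_iterations=2):
    """Apply the pre-registered rule. Each iteration is a list of
    (baseline_scores, candidate_scores) pairs, one per candidate it produced."""
    if len(iterations) < required_iterations:
        return "undecided"
    useful = sum(
        paired_bench(base, cand)
        for iteration in iterations[:required_iterations]
        for base, cand in iteration
    )
    return "human-in-the-loop" if useful == 0 else "keep-mining"


# Placeholder data: two iterations, one candidate each, both below baseline.
iterations = [
    [([0.70, 0.65, 0.80], [0.68, 0.66, 0.75])],
    [([0.70, 0.65, 0.80], [0.70, 0.60, 0.80])],
]
verdict = discovery_verdict(iterations)
```

Fixing the threshold and the iteration count before the runs is what makes the rule a falsifier rather than a post-hoc reading.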
The split makes sense in retrospect. Naming an object and authoring members of that object class are different problems. Options theory tells us what a tech IS. It doesn't tell us where new ones come from. That part is still where human intuition lives.
what we are not claiming
- That every tech is deterministic. Sampler variance is real. The schema now carries a solvability field — deterministic / stochastic / open — so the library is honest about which is which.
- That a future discovery primitive can't work. Two attempts at the specific algorithm we pre-registered failed. A different algorithm might succeed. We are not claiming the door is closed; only that the specific door we tried is.
- That options theory predicts which techs to write. It names the object; it does not author the option set.
where this came from
A casual message comparing MTGO play to Overwatch wavedash to tennis grips. The connection to the formal AI/cog-sci literature came after the gaming metaphor — and the metaphor turned out to be the cleaner mental model.
Three primary citations carry the framing. Sutton, Precup, and Singh (1999): the AI primary on options. Chase and Simon (1973): the cognitive-science primary on chunking. Chollet (2019): intelligence as skill-acquisition efficiency, which gives the optimization target. Two supporting sources fill in the rest. Anderson's ACT-R covers the mechanism by which a procedure becomes "second nature." Klein's recognition-primed decision model covers the deployment-side intuition.