Intuition Labs · φ research · 12 · measured · 17 may 2026

the box is a type, not a thing

we stopped writing flavor.yaml. an agent identity is now a type. invalid combinations cannot be constructed: the constructor is the validator.

1 substrate · 2 identities · 0 flavor files
Box = (orchestrator, techs, skills, termination, ui, persona)  ·  the value is the identity · the type is the contract · invalid identities fail to construct

the takeaway in one paragraph

An agent is just a particular combination of rules, skills, and a wrapper that wires them together. People usually capture that combination in a YAML file and call it a 'flavor.' We tried that for about an hour and realized the YAML would drift from the code the day after we wrote it. Instead, we made the identity a TYPE: a small piece of Python that lists what's in the agent and refuses to construct if the combination is incoherent. Want a coding agent? Construct one value. Want a research agent? Construct another. Want a research agent that uses a single-LLM-call orchestrator? It refuses to construct. The combination is incoherent, so it never becomes a thing you can run.

The box stays the same hardware. The substrate stays the same engine. What changes between identities is the type signature. We shipped the abstraction with two identities in the catalog. The coding agent runs exactly as before. The research agent is declared, but the substrate swap that actually flips between them is the next step.

the trouble with flavors

When you have one agent, you hardcode its config. When you have two, you start writing a config file — flavor.yaml or settings.toml or a JSON spec. The file says 'these rules are on, that termination condition fires, this orchestrator is the wrapper.' For a week or two it feels clean. Then the code grows. Someone adds a new rule that depends on the orchestrator. Someone else renames a skill. The YAML doesn't know. The code and the config drift apart. You ship a configuration that says one thing and runs another.

The fix that keeps being rediscovered: don't have two sources of truth. Pick the one you can actually check. We picked the Python value. The identity is a class instance with a strict schema. If you write a combination of fields that doesn't make sense, the class refuses to instantiate. The error message tells you which constraint you violated. There is no separate file to keep in sync because there is no separate file.

what a type does for us

Each identity is a value of a single Box type. The Box has six axes: orchestrator (how it talks to the model), techs (the dispatch rules), skills (the methodology library), termination (what counts as done), ui (how a human interacts with it), persona (the voice). Constructing a Box runs three checks at the moment of construction:

A Box that constructs successfully is a lawful identity by construction. There is no separate 'validate this YAML' step, because the validation IS the construction.
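As a rough sketch of what "validation is construction" can look like in plain Python: a frozen dataclass whose `__post_init__` raises before an incoherent value can exist. The six field names come from the axes above. Everything else is an assumption made for illustration: the concrete values, the `IncoherentIdentity` exception, the `MULTI_STEP_SKILLS` set, and the checks themselves, which are stand-ins rather than the checks the module actually runs.

```python
from dataclasses import dataclass


class IncoherentIdentity(ValueError):
    """Hypothetical error: the fields do not form a lawful identity."""


# Assumption: some skills need more than one model call to make sense.
MULTI_STEP_SKILLS = frozenset({"literature_survey"})


@dataclass(frozen=True)
class Box:
    orchestrator: str          # how it talks to the model
    techs: tuple[str, ...]     # the dispatch rules
    skills: tuple[str, ...]    # the methodology library
    termination: str           # what counts as done
    ui: str                    # how a human interacts with it
    persona: str               # the voice

    def __post_init__(self) -> None:
        # Stand-in checks; each message names the axis that was violated.
        if not self.techs:
            raise IncoherentIdentity("techs: at least one rule is required")
        if len(set(self.techs)) != len(self.techs):
            raise IncoherentIdentity("techs: rule names must be unique")
        needs_loop = MULTI_STEP_SKILLS.intersection(self.skills)
        if self.orchestrator == "single_call" and needs_loop:
            raise IncoherentIdentity(
                f"orchestrator: 'single_call' cannot run {sorted(needs_loop)}"
            )


# A lawful value constructs silently.
coding = Box(
    orchestrator="tool_loop",
    techs=("model_alias", "retry_on_length"),
    skills=("refactor",),
    termination="tests_pass",
    ui="cli",
    persona="terse",
)

# An incoherent one never comes into existence.
try:
    Box(
        orchestrator="single_call",
        techs=("model_alias",),
        skills=("literature_survey",),
        termination="report_written",
        ui="cli",
        persona="curious",
    )
except IncoherentIdentity as err:
    rejection = str(err)
```

Freezing the dataclass is a design choice in this sketch: once a value has passed its checks, nothing can mutate it into an unchecked state.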

the three ways two rules can combine

Once an identity is a value, the algebra over identities is the algebra over their parts — specifically over their rules. Two rules in the same box can compose in three different shapes, all of them matrix-shaped if you look closely:

composition | what it does | matrix shape
sequential | rule A's output is rule B's input | matrix multiplication (inner dimension cancels)
parallel | both rules fire independently on the same request | tensor product (dimensions multiply)
gated | rule A's verdict decides whether rule B fires at all | a branch through a graph (sum of paths)
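To make the three shapes concrete, here is a small pure-Python sketch on toy 0/1 matrices. The encoding is an assumption, not the note's own formalism: rows are input states, columns are output states, and the gated case is modeled as two projector-weighted paths that are summed.

```python
def matmul(a, b):
    """Sequential: a is m x k, b is k x n, result is m x n."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def kron(a, b):
    """Parallel: Kronecker product, a is m x n, b is p x q, result is mp x nq."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def add(a, b):
    """Entrywise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def shape(m):
    return (len(m), len(m[0]))


# Toy rules as 0/1 matrices: rows are input states, columns are output states.
rule_a = [[1, 0, 0],
          [0, 1, 1]]          # 2 x 3
rule_b = [[1, 0],
          [0, 1],
          [0, 1]]             # 3 x 2
rule_c = [[0, 1],
          [1, 0]]             # 2 x 2

sequential = matmul(rule_a, rule_b)   # 2 x 2: the shared 3 cancels
parallel = kron(rule_a, rule_c)       # 4 x 6: dimensions multiply

# Gated: a verdict splits the states into "pass" and "block" projectors.
passes = [[1, 0], [0, 0]]
blocks = [[0, 0], [0, 1]]
identity = [[1, 0], [0, 1]]
# Path 1: verdict passes, rule_c fires. Path 2: verdict blocks, nothing fires.
gated = add(matmul(passes, rule_c), matmul(blocks, identity))
```

The shapes are the part that carries over: `shape(sequential)` loses the inner dimension, `shape(parallel)` multiplies both dimensions, and `gated` keeps the original shape because it is a sum over alternative paths.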

The interesting part: each of these can be PROVED correct rather than just tested. The previous post showed how the parallel case proves out for the five pure-logic rules in our coding agent — every pair of them, all the way to the joint decision space, verified by construction. The sequential case is next; we have one obvious example to compile (the model-alias rule, sequenced with the retry-on-length rule). The gated case follows.

What matters for the type system: each pair of rules has a known composition shape, and that shape is the lookup key for the compatibility check. When you construct an identity, the type system walks every pair and asks the table: lawful, regime-dependent, destructive-everywhere? A destructive pair refuses to construct. A regime-dependent pair flows through but carries the caveat. A lawful pair is invisible — that's the point.
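The pair walk described above could be sketched roughly like this. The table layout, the `Verdict` enum, the `budget_cap` rule, and the fail-closed handling of unlisted pairs are all assumptions for illustration; only `model_alias` and `retry_on_length` correspond to rules named in this note, and the verdicts assigned to them here are made up.

```python
from enum import Enum
from itertools import combinations


class Verdict(Enum):
    LAWFUL = "lawful"
    REGIME_DEPENDENT = "regime-dependent"
    DESTRUCTIVE = "destructive-everywhere"


# Hypothetical table: unordered pair -> (composition shape, verdict).
INTERACTIONS = {
    frozenset({"model_alias", "retry_on_length"}): ("sequential", Verdict.LAWFUL),
    frozenset({"model_alias", "budget_cap"}): ("parallel", Verdict.REGIME_DEPENDENT),
    frozenset({"retry_on_length", "budget_cap"}): ("gated", Verdict.DESTRUCTIVE),
}


def check_pairs(techs, table=INTERACTIONS):
    """Walk every pair; raise on a destructive one, return caveats otherwise."""
    caveats = []
    for a, b in combinations(sorted(set(techs)), 2):
        entry = table.get(frozenset({a, b}))
        if entry is None:
            # Assumption: an unlisted pair fails closed rather than passing.
            raise LookupError(f"no interaction entry for {a} + {b}")
        shape, verdict = entry
        if verdict is Verdict.DESTRUCTIVE:
            raise ValueError(f"{a} + {b} ({shape}) is destructive everywhere")
        if verdict is Verdict.REGIME_DEPENDENT:
            caveats.append(f"{a} + {b} ({shape}) is regime-dependent")
        # Lawful pairs add nothing: they stay invisible.
    return caveats
```

Called from a constructor, `check_pairs` would let a lawful set through with an empty caveat list, attach caveats for regime-dependent pairs, and abort construction on the first destructive pair.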

what we shipped

The Box type, the catalog with two entries, and a small command-line interface to inspect them — all in one module that didn't exist this morning. The coding agent runs exactly as before. The research agent is declared and validates; switching the substrate over to actually run it is a separate piece of work.

piece | what it does
Box / Tech / Skill types | the schema · constructors run validators · invalid combinations refuse to instantiate
catalog (registry) | two identities listed as Python values · adding a new one is ~30 lines
interaction table | per-pair compatibility · backed by the construction-frame proofs from the previous note
box CLI | list / show / which / use / validate / dump: operates on the identity layer only · the running stack is untouched
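For a feel of how small the catalog and an inspection command can be, here is a hedged sketch using `argparse` from the standard library. It covers only hypothetical `list` and `show` subcommands; the field values, the `run` helper, and the JSON output format are assumptions, and the validators are omitted to keep the sketch short.

```python
import argparse
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Box:
    # Same six axes as the text; validators left out of this sketch.
    orchestrator: str
    techs: tuple
    skills: tuple
    termination: str
    ui: str
    persona: str


# The registry is just a dict of Python values. All values are hypothetical.
CATALOG = {
    "coding": Box("tool_loop", ("model_alias", "retry_on_length"),
                  ("refactor",), "tests_pass", "cli", "terse"),
    "research": Box("tool_loop", ("model_alias",),
                    ("literature_survey",), "report_written", "cli", "curious"),
}


def run(argv):
    """Parse argv and return the text the command would print."""
    parser = argparse.ArgumentParser(prog="box")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("list", help="names of all identities in the catalog")
    show = sub.add_parser("show", help="fields of one identity")
    show.add_argument("name", choices=sorted(CATALOG))
    args = parser.parse_args(argv)

    if args.command == "list":
        return "\n".join(sorted(CATALOG))
    # Only "show" remains: read the value, never touch the running stack.
    return json.dumps(asdict(CATALOG[args.name]), indent=2)


if __name__ == "__main__":
    import sys
    print(run(sys.argv[1:]))
```

Because every command only reads `CATALOG`, the sketch stays on the identity layer in the same sense as the table above.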

The substrate — the engine, the request proxy, the tech library on disk, the cycle and bench tooling — is unchanged. The new module imports nothing from the running stack and is imported by nothing in it. Deleting the new directory returns the box to today's behavior. The split is real, not nominal.
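One way to keep the "imports nothing from the running stack" half of that claim checkable is a small static scan with the standard `ast` module. This is a sketch under assumptions: the package names `engine` and `proxy` are hypothetical, the scan works on source text rather than real file paths, and it only covers one direction (the reverse check would scan the running stack for imports of the new module).

```python
import ast


def imported_modules(source):
    """Return the top-level package names a piece of source code imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which stay inside the module.
            found.add(node.module.split(".")[0])
    return found


def violates_split(source, forbidden):
    """Return the forbidden packages that the source imports, if any."""
    return imported_modules(source) & set(forbidden)


# Hypothetical names for packages in the running stack.
RUNNING_STACK = {"engine", "proxy"}

clean = "import json\nfrom dataclasses import dataclass\nfrom . import catalog\n"
leaky = "from engine.core import dispatch\n"

clean_hits = violates_split(clean, RUNNING_STACK)
leaky_hits = violates_split(leaky, RUNNING_STACK)
```

Here `clean_hits` is empty and `leaky_hits` names the offending package, so a test can assert on the first and fail loudly on the second.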

what we're not claiming

where this came from

Reading the open-source agent frameworks people are publishing made one thing obvious: everyone reaches for a flavor abstraction the moment they have more than one identity. Everyone also drifts. The shape of the fix turns out to be linguistic: the right primitive isn't a file format but a type. Mac Lane wrote down the algebraic structure for typed compositions in the 1940s. Sutton, Precup, and Singh restated it for reinforcement learning options in 1999. Linear logic has been pointing at the same shape since Girard in 1987. Programming language type systems have been enforcing exactly these constraints for decades.

We are not inventing. We are using the right tool. The catalog of identities is a few Python values. The composition checks are a few constructor validators. The CLI is a hundred lines. The full system fits in your head. That's the test that matters.