the method
One design move, five patterns.
Read top to bottom and a single move recurs, applied to a different layer each time: convert a safety property from a rule that can be talked past into a capability the system simply does not have. Friction is paid up front, on purpose, so a class of failure becomes impossible downstream.
Foundations
structure first, enforced mechanically
The thesis, stated first
The whole practice rests on one conviction. A safety property written as a rule is a soft guard: a future prompt can talk past it. A safety property enforced by what a tool can and cannot do is a hard guard: no prompt can route around a capability that was never granted. Wherever the cost of a mistake is high, the system chooses the hard guard. It does not ask an agent to behave; it arranges things so the unsafe action is not available.
Convert a safety property from a written policy into a structural impossibility at the capability layer. That single move, repeated, is the design philosophy this site documents.
The starting condition
A continuous pipeline already turned raw inbound material into distilled insight notes. The next move was to convert those insights into durable, machine-readable configuration for an AI coding agent: reusable capabilities, behavioral rules, specialist roles, and settings. The goal was stated up front, and it set the bar for everything after: the result had to be a structure an agent could walk into cold and act on correctly without guessing.
Four decisions that set the shape
The configuration directory is the home, not a sync target. Keeping working notes in a separate store that synced into the agent's config directory was rejected: a sync layer introduces drift, and the rules an agent reads have to live exactly where it reads them. The config directory itself became the version-controlled, human-openable vault.
Logical hierarchy, flat filesystem. The instinct was to nest roles by division on disk. Testing proved the coding agent does not look inside nested role folders, so a nested role would be silently invisible. The fix: storage stays flat, division is a metadata field, and the org chart is generated from those fields.
A two-way ownership contract prevents bloat. Every specialist capability declares which roles may use it; every role declares which capabilities it carries. The two lists must agree, and a validator fails the build on any mismatch. Orphaned capabilities are forbidden. Bloat is not discouraged in prose; it is made to fail a check.
Numeric bloat caps. Up to six capabilities on a role passes, seven to nine warns, ten or more fails outright until the role is split or pruned. The concern stopped being a vibe and became a threshold.
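The contract and the caps can be sketched as a small validator. This is an illustration under stated assumptions, not the site's actual validator: the `Report` class, the `validate` function, and the shape of the two input mappings are hypothetical, and an orphan is assumed to mean a capability that no role may use. Only the thresholds (six, seven to nine, ten) come from the text.

```python
from dataclasses import dataclass, field

# Thresholds from the text: up to 6 passes, 7 to 9 warns, 10 or more fails.
WARN_AT = 7
FAIL_AT = 10


@dataclass
class Report:
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # The build passes only when there are no errors; warnings do not fail it.
        return not self.errors


def validate(roles: dict[str, set[str]], capabilities: dict[str, set[str]]) -> Report:
    """Check the two-way contract and the bloat caps (hypothetical data shape).

    roles:        role name -> capabilities the role says it carries
    capabilities: capability name -> roles the capability says may use it
    """
    report = Report()
    for role, carried in sorted(roles.items()):
        for cap in sorted(carried):
            if role not in capabilities.get(cap, set()):
                report.errors.append(f"{role} carries {cap}, but {cap} does not list {role}")
        if len(carried) >= FAIL_AT:
            report.errors.append(f"{role} has {len(carried)} capabilities: split or prune")
        elif len(carried) >= WARN_AT:
            report.warnings.append(f"{role} has {len(carried)} capabilities: nearing the cap")
    for cap, allowed in sorted(capabilities.items()):
        if not allowed:
            # Assumption: an orphan is a capability that no role may use.
            report.errors.append(f"{cap} is orphaned: no role may use it")
        for role in sorted(allowed):
            if cap not in roles.get(role, set()):
                report.errors.append(f"{cap} lists {role}, but {role} does not carry {cap}")
    return report
```

A build script would call `validate` on the parsed metadata and exit non-zero whenever `report.ok` is false, which is what turns the concern from prose into a failing check.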
What this established
By the end of the founding work the skeleton was in place and green against its own validator: a handful of specialist roles across a few divisions, the first behavioral rules and their index, the first decision records, and the validator that parses metadata, enforces the contract and the caps, and regenerates the org chart deterministically. The throughline is the thesis: structure first, enforced mechanically, so everything built on top inherits consistency instead of re-deriving it.
The Substrate
two layers of memory
The problem this solves
As the system grew, the same failure kept recurring: each new session re-derived the same process decisions from scratch, and sometimes got them wrong. An early idea, a role that would carry a whole prior transcript forward into each session, was rejected because transporting whole transcripts forces unbounded context and does not scale. The better answer was to refine the shared substrate itself, so every agent starts from a consistent way of working and most context transport becomes unnecessary. That split the substrate into two layers.
Layer one: load-on-demand rules
Behavioral rules encode procedure: when you are doing X, follow Y. They are not loaded every session, because that would burn context for no reason. Each rule declares a trigger, an index lists every rule with its trigger, and a rule's body loads only when the current task matches. A standing requirement falls out of this: triggers must be disjoint, because two overlapping triggers would both load their rules when only one applies.
The rules cover the recurring hazards, among them a documentation-quality rule, a discovery-conventions rule that captures each silent failure mode as a worked anti-example, a migration narrow-patcher rule, and a placement rule that decides which tier owns any given output.
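Trigger-gated loading and the disjointness requirement can be sketched as follows. This is a minimal sketch under assumptions: triggers are modeled as sets of task tags, and the tag names, the `INDEX` layout, and the `read_body` callback are all hypothetical. Only the four rule names echo the text.

```python
from itertools import combinations
from typing import Callable

# Hypothetical index: rule name -> set of task tags that trigger it.
INDEX: dict[str, set[str]] = {
    "documentation-quality": {"write-docs", "edit-docs"},
    "discovery-conventions": {"add-role", "add-capability"},
    "migration-narrow-patcher": {"migrate"},
    "placement": {"route-output"},
}


def overlapping_triggers(index: dict[str, set[str]]) -> list[tuple[str, str, list[str]]]:
    """Return every pair of rules whose triggers share a tag (should be empty)."""
    return [
        (a, b, sorted(index[a] & index[b]))
        for a, b in combinations(sorted(index), 2)
        if index[a] & index[b]
    ]


def load_rules(
    task_tags: set[str],
    index: dict[str, set[str]],
    read_body: Callable[[str], str],
) -> dict[str, str]:
    """Read a rule's body only when its trigger matches the current task."""
    return {
        name: read_body(name)
        for name, trigger in index.items()
        if trigger & task_tags
    }
```

Because `read_body` is called only for matching rules, unmatched bodies never enter the context; running `overlapping_triggers` in the validator would make the disjointness requirement a check instead of a convention.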
Layer two: the always-loaded principles
Underneath the trigger-gated procedures sits a short, always-loaded posture document: the tenets. Where rules encode how to do a category of work, the tenets encode the why that every role shares. They were distilled from the existing rules and decisions, a consolidation of values already implicit, not an invention.
Verify before guessing. Documentation is part of the deliverable. Narrow scope, refuse expansion. Conservative by default. Refuse and surface beats attempt with uncertainty. The human is the trust boundary for irreversible actions.
That last principle, the human on the irreversible step, becomes the spine of everything that follows.
How the two layers relate
The division of labor is deliberate. Procedure relevant only in a specific context belongs in a trigger-gated rule, so the always-loaded surface stays thin. Posture that should color every action belongs in the tenets. Cross-references thread through the whole substrate, so a session starting from any entry point discovers the rest. The principles evolve only when a choice is settled, and a load-bearing change to them goes through a formal decision record with rationale and alternatives.
Meta-Learning
it learns, but cannot promote itself
The gap this closes
By this point the system had a multi-layer memory and a framework for routing any output to the right tier. What it lacked was a way to notice patterns across sessions and turn a recurring observation into a durable improvement. The learning loop had been deferred on purpose, with a precondition: let the lower layers run for at least five substantive sessions first, so the thresholds could be calibrated against real observations rather than guessed. When that precondition was met, the loop was built.
The two pieces
An append-only observation ledger. One file accumulates one line per noticed pattern, each line carrying a small fixed schema: a recurrence key for deduplication, a type, a target tier, a scope, a status, and a short note. Capture piggybacks on the existing session journal trigger, so no new ritual is added. The content rule is strict: no sensitive data ever enters the ledger.
A curator that proposes, and only proposes. A dedicated role reads the ledger, groups lines by recurrence key, counts each group against a flat threshold, and for any group at threshold emits a structured text proposal. For groups below threshold it simply reports the running count.
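The two pieces can be sketched together. This is an illustration, not the real curator: it assumes the ledger is stored as one JSON object per line, the field names (`key`, `type`, `target_tier`, `scope`, `status`, `note`) are hypothetical spellings of the schema above, and the threshold value of 3 is invented for the example.

```python
import json
from collections import defaultdict

THRESHOLD = 3  # hypothetical value; the text says only that the threshold is flat


def curate(ledger_lines: list[str], threshold: int = THRESHOLD) -> list[str]:
    """Group ledger lines by recurrence key and return inert text only."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for line in ledger_lines:
        line = line.strip()
        if line:
            entry = json.loads(line)
            groups[entry["key"]].append(entry)

    report: list[str] = []
    for key, entries in sorted(groups.items()):
        count = len(entries)
        if count >= threshold:
            latest = entries[-1]
            report.append(
                f"PROPOSAL key={key} count={count} "
                f"target_tier={latest['target_tier']}: {latest['note']}"
            )
        else:
            # Below threshold: report the running count and nothing more.
            report.append(f"count key={key}: {count}/{threshold}")
    return report
```

The function returns strings and has no side effects, which mirrors the "proposes, and only proposes" constraint: nothing it produces changes a rule until someone else acts on it.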
The load-bearing decision: the bypass is tool-impossible
The most important decision here is how the human stays in control. A curator that could write to the rules or the tenets would be exactly the design bug the last tenet, the human as trust boundary for irreversible actions, warns about. So the curator was given a deliberately minimal toolset: it can read and search, and nothing else. No write, no edit, no shell, no ability to dispatch another role.
A written instruction that the curator should not write is a soft guard a prompt could talk past. An absent capability is a hard guard no prompt can route around. The human gate is enforced by the toolset itself, not by policy text.
Once the ledger has accumulated enough observations, promotion has three steps by design: the curator proposes inert text, a human approves, and a separate narrow-scope writer persists the change. The curator occupies only the first step.
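The difference between a soft guard and a hard guard can be made concrete with a toy capability model. Everything here is hypothetical (the `Agent` class, the tool names, the in-memory `store`); it is a sketch of the idea that an ungranted tool is absent, not a description of any real agent runtime.

```python
from typing import Callable


class ToolNotGranted(Exception):
    """Raised when a role asks for a tool that is not in its toolset."""


class Agent:
    def __init__(self, name: str, tools: dict[str, Callable]):
        self.name = name
        self._tools = dict(tools)  # the role holds only what it was granted

    def call(self, tool: str, *args):
        if tool not in self._tools:
            raise ToolNotGranted(f"{self.name} has no '{tool}' tool")
        return self._tools[tool](*args)


def make_tools(store: dict[str, str]) -> dict[str, Callable]:
    """Build read, search, and write tools over a hypothetical in-memory store."""
    def read(path: str) -> str:
        return store[path]

    def search(term: str) -> list[str]:
        return sorted(path for path, text in store.items() if term in text)

    def write(path: str, text: str) -> None:
        store[path] = text

    return {"read": read, "search": search, "write": write}


def grant(tools: dict[str, Callable], *names: str) -> dict[str, Callable]:
    """Select the subset of tools a role is allowed to hold."""
    return {name: tools[name] for name in names}
```

With `curator = Agent("curator", grant(tools, "read", "search"))`, a call such as `curator.call("write", "rules", "...")` raises `ToolNotGranted` no matter what the prompt says, while a separate writer granted only `write` can persist a change after a human approves it.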
The friction is the feature
Promotion takes three steps instead of one, and the curator cannot fix even an obviously correct proposal on its own. That is intentional. The friction is the gate, and "obviously correct" is exactly the judgment the last tenet reserves for the human. The design is self-protecting: any session tempted to grant the curator write access "to save a step" trips a trigger that surfaces this decision first. This is the system's signature move applied to its own evolution.
Assisted Autonomy
automate the reversible, gate the irreversible
Where the story arrives
This milestone is about friction, not new capability. By this point several agent teams ran across the projects plus the global configuration, and day-to-day work carried avoidable overhead: every cross-project task had to be hand-assembled as a multi-step invocation, a couple of single-specialist projects had no orchestrator and were unroutable from the top, and several teams had not yet moved into the shared central hold. The goal was lower friction, without crossing the two constraints the whole system is built around: the human as trust boundary, and a standing rule to stay on subscription auth and never route work through a raw metered API key.
The posture was named precisely: assisted one-touch. One command per cross-project task, approvals on, and no unattended loops or schedules running live.
The friction work
- Commit-only auto-save. The automatic push was removed, leaving commit only. A commit is locally reversible; a push is not. An unattended push is an irreversible action with no human in the loop, so it belongs behind an explicit human step. This single change is what keeps auto-save compliant with the gate.
- Thin orchestrators for the orphaned projects, so the whole fleet follows the uniform "orchestrator plus specialists" shape and every project is routable from the top.
- A one-touch front door. A single command-line entry point, adversarially hardened: the request is passed as one argument so injection vectors are inert, it aborts if a raw API key is present, it resolves scope from a static table never derived from input, and it hard-wires the granular approve-edits mode and refuses blanket bypass. It is one touch, never zero touch.
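The front door's four hardening properties can be sketched as a function that builds an argument list. All names here are assumptions made for illustration: the `agent` executable, its `--cwd` and `--approval-mode` flags, the `AGENT_API_KEY` variable, and the scope table entries are hypothetical and do not describe any real CLI.

```python
API_KEY_VAR = "AGENT_API_KEY"    # hypothetical environment variable name
APPROVAL_MODE = "approve-edits"  # hypothetical mode name, hard-wired on purpose

# Static scope table: project names map to directories chosen ahead of time.
# Nothing in it is ever derived from the request text. Entries are hypothetical.
SCOPES = {
    "docs": "projects/docs",
    "site": "projects/site",
}


class Refused(Exception):
    """Raised when the front door declines to build a command."""


def build_command(scope: str, request: str, env: dict[str, str]) -> list[str]:
    """Return an argv list for one cross-project task, or refuse."""
    if env.get(API_KEY_VAR):
        raise Refused("raw API key present; staying on subscription auth")
    if scope not in SCOPES:
        raise Refused(f"unknown scope {scope!r}")
    # The request travels as exactly one argv element and is never
    # interpolated into a shell string, so shell metacharacters stay inert.
    return [
        "agent",
        "--cwd", SCOPES[scope],
        "--approval-mode", APPROVAL_MODE,
        "--", request,
    ]
```

A caller would hand the returned list to a process launcher without a shell. There is deliberately no parameter for the approval mode, so a blanket bypass cannot be requested through this entry point at all.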
The line that will not be crossed
The durable artifact is a decision rule that picks an automation primitive by intent, measured against two hard constraints: no raw API key, and a human gate on every irreversible action. Its most pointed entry is a named rejection.
An unattended cloud scheduled routine for irreversible work is rejected outright: it would run with no approval prompts, in a fresh clone carrying none of this system's substrate, and pull toward a raw metered key. That single row encodes the line convenience does not get to cross.
The gate, demonstrated live
The work ended on the gate working in real time, not in theory: every commit the migration produced was left local and unpushed, pending a deliberate review and push. That is the whole arc in miniature. Friction driven down to a single command, while the one action that cannot be undone still waits for a person to say yes.
Publishing Safely
the thesis applied to this very page
The framing
This site narrates a private engineering history without exposing any of it. That is not a promise made in prose; it is a property built into how the site is produced. The publishing machinery is the clearest demonstration of the central thesis, because the stakes are concrete and irreversible: one leaked identifier on a public page cannot be unsaid. So the unsafe action was not labeled "be careful." The pipeline was arranged so the unsafe action is not available.
The single reader
Exactly one role reads raw private sources. That role is the sole reader of the raw history and of the private real-to-alias map, and it writes only sanitized, alias-only drafts into a staging area. Everything downstream reads only the human-approved summaries, never the raw sources and never the map.
The boundary is enforced at the tool layer, not by instruction. The reader is granted read, search, and a write restricted to staging. It is denied a shell, denied the ability to dispatch another role, denied in-place edit, and denied any network. The boundary is tool-impossible to cross.
And it fails closed: when a source names a component with no alias in the map, it emits nothing for that component, never inventing or leaking the real name. A missing mapping is a publish blocker, not a pass-through.
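Fail-closed aliasing can be sketched in a few lines. This is a simplified illustration: the `sanitize` function and its input shape are hypothetical, and it assumes each note mentions only component names that appear in the map, which a real pipeline could not take for granted.

```python
def sanitize(
    components: list[tuple[str, str]],
    alias_map: dict[str, str],
) -> tuple[list[str], list[int]]:
    """Turn (real_name, note) pairs into alias-only drafts, failing closed.

    Returns the drafts plus the positions of components that had no alias.
    Blockers are reported by position so the real name never leaves this function.
    """
    drafts: list[str] = []
    blockers: list[int] = []
    for position, (real_name, note) in enumerate(components):
        alias = alias_map.get(real_name)
        if alias is None:
            # No mapping: emit nothing for this component and block publishing.
            blockers.append(position)
            continue
        # Replace every mapped real name that appears in the note.
        for real, mapped in alias_map.items():
            note = note.replace(real, mapped)
        drafts.append(f"{alias}: {note}")
    return drafts, blockers
```

A non-empty `blockers` list would stop the publish step outright; the missing component is neither guessed at nor passed through under its real name.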
Defense in depth
Three more mechanisms back the seam:
- A default-deny gate. A whole-vault leak gate treats a new artifact as forbidden until it is explicitly classified safe. Default deny is itself the thesis: the safe state is the one that needs a positive decision to relax, not the one that needs remembering to lock.
- A build-time denylist. The forbidden-term list is sourced from the private map and is backed by independent structural patterns (absolute paths, version control URLs, personal email forms), because no hand-maintained list can be proven complete.
- An independent pre-publish scan. Two independent mechanisms actually caught real names a single check missed, which is the validation of the layered design. Go-live itself stays a human gate; no agent deploys or pushes.
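The first two mechanisms can be sketched as one gate function. The regular expressions below are rough illustrative patterns, not the site's real ones, and the `gate` function, the `"safe"` classification label, and the hosting domains listed are all assumptions made for the example.

```python
import re

# Rough, illustrative structural patterns; a real scan would need broader ones.
STRUCTURAL_PATTERNS = {
    "absolute path": re.compile(
        r"(?<![\w:/])/(?:home|Users|srv|var|etc|opt)/\S+|\b[A-Za-z]:\\\S+"
    ),
    "version control URL": re.compile(
        r"(?:git@|https?://)(?:github\.com|gitlab\.com|bitbucket\.org)[:/]\S+"
    ),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def gate(
    artifact: str,
    text: str,
    classifications: dict[str, str],
    denylist: set[str],
) -> list[str]:
    """Return the reasons an artifact is blocked; an empty list means it may pass."""
    # Default deny: anything not explicitly classified safe is forbidden.
    if classifications.get(artifact) != "safe":
        return ["unclassified artifact"]

    reasons: list[str] = []
    lowered = text.lower()
    if any(term and term.lower() in lowered for term in denylist):
        reasons.append("denylisted term")
    # Structural patterns run independently of the term list.
    for kind, pattern in STRUCTURAL_PATTERNS.items():
        if pattern.search(text):
            reasons.append(kind)
    return reasons
```

The reasons name only the kind of finding, never the matched text, so the gate's own output cannot leak what it caught. The independent pre-publish scan would be a separate implementation run later, which is the point of layering.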
The same posture, extended
The approach governed two recent additions too. An external design skill was brought in as one pinned, security-patched single copy, vendored through a fail-closed checklist rather than forked per project or faked as first party. And a clarifier command was added that inspects a draft, surfaces its gaps, and proposes a sharper version: it never rewrites silently and never executes, and the opt-in is enforced structurally, so the model cannot auto-invoke it.
The point
Every mechanism here is the thesis applied to the riskiest surface the system has, its own public output. A single reader that structurally cannot leak. A gate that denies by default. A denylist backed by independent patterns. A second scan that already earned its place. None of it is advisory. The safety lives in what the tools can and cannot do, and the one action that cannot be undone, going live, still waits for a person to say yes.