Keep tool-calling agents on a short leash

When we started, the loudest voices in the room wanted the architecture diagram to look impressive. Kafka, a service mesh, CQRS, a half-dozen microservices before we had a single paying customer. I made an unpopular call: we would ship the most boring thing that could possibly work, and we would not add a component until product evidence made it earn its place.

Four years later that "boring" stack was still the core of the platform, carrying every order, every ledger write, and every background job. This is the part nobody tells you up front: restraint is an architecture, and holding it is far harder than adding a queue.

That same lesson applies to agents. The impressive demo is easy: give the model a terminal, a browser, a repo, and a vague goal. The useful system is much smaller. It knows what it is allowed to touch, what it must ask before doing, and how to prove that the answer got better instead of louder.

The agent I trust is not autonomous in the science-fiction sense. It is autonomous in the boring CI sense: scoped, logged, retryable, and easy to stop.

Shrink the tool surface first

The first security decision is not the prompt. It is the list of tools.

If an agent is reviewing a pull request, it does not need write access. If it is fixing a test, it does not need your deployment secrets. If it is updating dependencies, it should not edit the fetch client, the auth fallback, or anything else you already know is outside the blast radius.

I keep the surface small enough that the tool list itself reads like an API contract:

agent-tools.md

allowed:
- read files
- search with rg
- run bun test / lint / typecheck
- apply patch inside claimed files

ask first:
- delete files
- change migrations
- edit auth, billing, or transport clients
- install packages

never:
- print secrets
- rewrite generated clients
- mask a failing check

This is not there to make the agent polite. It is there to make the behavior inspectable. When something goes wrong, I want to know whether the contract was too wide, the instruction was ambiguous, or the model ignored the boundary.
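One way to make that contract checkable rather than aspirational is to route every tool call through a small decision function. The sketch below is hypothetical throughout: the tool names (`read_file`, `apply_patch`, and so on), the `Policy` shape, and the example paths are assumptions made for illustration, not the API of any agent framework.

```typescript
type Decision = "allow" | "ask" | "deny";

interface ToolCall {
  tool: string;  // hypothetical tool name, e.g. "apply_patch"
  path?: string; // file the call targets, if any
}

interface Policy {
  allowed: string[];         // tools that run without confirmation
  askFirst: string[];        // tools that need a human yes
  never: string[];           // tools that are always refused
  writeTools: string[];      // allowed tools that modify files
  claimedFiles: string[];    // files the agent may patch for this task
  guardedPrefixes: string[]; // paths where any write escalates to "ask"
}

function decide(policy: Policy, call: ToolCall): Decision {
  if (policy.never.includes(call.tool)) return "deny";
  if (policy.askFirst.includes(call.tool)) return "ask";
  // Anything not listed is refused, so the surface only grows on purpose.
  if (!policy.allowed.includes(call.tool)) return "deny";

  if (policy.writeTools.includes(call.tool)) {
    const path = call.path;
    if (path === undefined) return "deny"; // a write with no target cannot be checked
    if (policy.guardedPrefixes.some((prefix) => path.startsWith(prefix))) return "ask";
    if (!policy.claimedFiles.includes(path)) return "deny";
  }
  return "allow";
}

// Example policy mirroring agent-tools.md; every name and path here is made up.
const policy: Policy = {
  allowed: ["read_file", "search", "run_check", "apply_patch"],
  askFirst: ["delete_file", "change_migration", "install_package"],
  never: ["print_secret", "rewrite_generated_client", "mask_failing_check"],
  writeTools: ["apply_patch"],
  claimedFiles: ["src/cart/total.ts", "src/cart/total.test.ts"],
  guardedPrefixes: ["src/auth/", "src/billing/", "src/transport/"],
};
```

Unknown tools fall through to `deny`, so widening the surface is always an explicit edit to the policy. It also keeps the post-mortem question answerable: a denied call that ran anyway means the boundary was ignored, while an allowed call that caused harm means the contract was too wide.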

Let the repo reject bad work

Agents produce a lot of text. Repos produce facts.

The useful loop is simple: inspect, patch, run the repo's own checks, read the result, patch again. I do not want a "looks good" summary before the tests run. I want the exact command, the exact result, and a patch that makes the behavior pass for the right reason.

The best agents I have used are almost boring here. They do not celebrate. They run bun test, bun run lint, bun run typecheck, and bun run build, then they report what changed. If they cannot run a gate, they say why. That habit matters more than clever prompt wording.
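That reporting habit can be sketched as a small gate runner. This is an illustration, not a real harness: the `Runner` function is injected and hypothetical, standing in for whatever actually spawns the commands, and the gate list simply mirrors the four commands named above.

```typescript
// What the injected runner hands back: a real exit code, or the reason it could not run.
type RunOutcome =
  | { exitCode: number; output: string }
  | { unavailable: string };

// Hypothetical: a real runner would spawn the command in the repo.
type Runner = (command: string) => RunOutcome;

type GateResult =
  | { command: string; status: "passed" | "failed"; exitCode: number; output: string }
  | { command: string; status: "skipped"; reason: string };

const GATES: string[] = ["bun test", "bun run lint", "bun run typecheck", "bun run build"];

// Runs every gate, even after a failure, so the report is complete.
function runGates(run: Runner, gates: string[] = GATES): GateResult[] {
  return gates.map((command): GateResult => {
    const outcome = run(command);
    if ("unavailable" in outcome) {
      return { command, status: "skipped", reason: outcome.unavailable };
    }
    return {
      command,
      status: outcome.exitCode === 0 ? "passed" : "failed",
      exitCode: outcome.exitCode,
      output: outcome.output,
    };
  });
}

// One line per gate: the exact command and the exact result, no adjectives.
function report(results: GateResult[]): string {
  return results
    .map((r) =>
      r.status === "skipped"
        ? `SKIPPED ${r.command}: ${r.reason}`
        : `${r.status.toUpperCase()} ${r.command} (exit ${r.exitCode})`
    )
    .join("\n");
}
```

A skipped gate is a distinct outcome with a stated reason, so "could not run" never gets rounded up to "passed".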

flowchart LR
  A[scope] --> B[read]
  B --> C[patch]
  C --> D[run checks]
  D -->|fail| B
  D -->|pass| E[explain diff]
  E --> F[human review]
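The same loop, written as control flow. This is a sketch under assumptions: the four steps are injected as hypothetical callbacks, and the `maxAttempts` cap is my addition to keep the loop easy to stop; the flowchart itself does not specify one.

```typescript
interface CheckResult {
  passed: boolean;
  log: string; // exact output of the repo's own checks
}

// Hypothetical step functions; how an agent reads or patches is out of scope here.
interface LoopSteps {
  read: (lastFailure: string | null) => string; // gather context, including the last failure
  patch: (context: string) => void;
  runChecks: () => CheckResult;
  explainDiff: () => string;
}

type LoopOutcome =
  | { status: "ready_for_review"; attempts: number; explanation: string }
  | { status: "stopped"; attempts: number; lastFailure: string };

function runLoop(steps: LoopSteps, maxAttempts: number): LoopOutcome {
  let lastFailure: string | null = null;
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    const context = steps.read(lastFailure);
    steps.patch(context);
    const result = steps.runChecks();
    if (result.passed) {
      // Explaining the diff only happens after the checks pass, never before.
      return { status: "ready_for_review", attempts, explanation: steps.explainDiff() };
    }
    lastFailure = result.log; // a failure feeds the next read, as in the diagram
  }
  return { status: "stopped", attempts, lastFailure: lastFailure ?? "no attempts were made" };
}
```

Both outcomes end with a human: either a passing diff to review or a stopped run with the last failure attached.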

Evals are the harness, not the ceremony

For code agents, I like small evals that look like real work:

- take a failing test and fix only the bug;
- update a generated contract without adding legacy aliases;
- remove hollow tests without deleting meaningful coverage;
- migrate one route pattern and prove every caller moved with it.

Each eval should have a cheap oracle. Did the test fail before and pass after? Did the diff avoid forbidden files? Did it add a compatibility shim when the instruction said "make a clean cut"? If the answer needs a meeting, the eval is too vague.
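Those three questions fit in one function. The sketch below is a hypothetical grader: the `EvalSpec` and `EvalRun` shapes are assumptions about what a harness might record, and the shim markers are crude regex heuristics that a real eval would tune to its own codebase.

```typescript
interface EvalSpec {
  forbiddenPrefixes: string[]; // paths the diff must not touch
  shimMarkers: RegExp[];       // hypothetical heuristics for compatibility shims
}

interface EvalRun {
  failedBefore: boolean;  // target test failed on the untouched repo
  passedAfter: boolean;   // same test passes with the patch applied
  changedFiles: string[]; // paths touched by the diff
  addedLines: string[];   // lines the diff added
}

interface Verdict {
  pass: boolean;
  failures: string[];
}

function grade(spec: EvalSpec, run: EvalRun): Verdict {
  const failures: string[] = [];

  if (!run.failedBefore) failures.push("target test did not fail before the patch");
  if (!run.passedAfter) failures.push("target test does not pass after the patch");

  for (const file of run.changedFiles) {
    if (spec.forbiddenPrefixes.some((prefix) => file.startsWith(prefix))) {
      failures.push(`diff touches forbidden file: ${file}`);
    }
  }

  for (const line of run.addedLines) {
    if (spec.shimMarkers.some((marker) => marker.test(line))) {
      failures.push(`added line looks like a compatibility shim: ${line.trim()}`);
    }
  }

  return { pass: failures.length === 0, failures };
}
```

Every failure is a string a person can read in seconds, which is the point: the verdict needs no meeting.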

The transcript is part of the artifact. I want to see where the agent searched, which files it ignored, and what it chose not to change. Silence is where expensive mistakes hide.
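If the transcript is an artifact, it helps to give it a shape. The event kinds below are hypothetical, not a format any tool emits; the sketch only shows how a structured log could answer the three questions directly: what was searched, what was read but left alone, and what the agent explicitly declined to change.

```typescript
// Hypothetical transcript events; a real agent log would carry more detail.
type TranscriptEvent =
  | { kind: "search"; query: string }
  | { kind: "read"; path: string }
  | { kind: "patch"; path: string }
  | { kind: "declined"; path: string; reason: string };

interface TranscriptSummary {
  searched: string[];         // queries the agent ran
  readButUntouched: string[]; // files it looked at and left alone
  declined: string[];         // changes it chose not to make, with reasons
}

function summarize(events: TranscriptEvent[]): TranscriptSummary {
  const searched: string[] = [];
  const read: string[] = [];
  const patched: string[] = [];
  const declined: string[] = [];

  for (const event of events) {
    switch (event.kind) {
      case "search":
        searched.push(event.query);
        break;
      case "read":
        if (!read.includes(event.path)) read.push(event.path);
        break;
      case "patch":
        if (!patched.includes(event.path)) patched.push(event.path);
        break;
      case "declined":
        declined.push(`${event.path}: ${event.reason}`);
        break;
    }
  }

  return {
    searched,
    readButUntouched: read.filter((path) => !patched.includes(path)),
    declined,
  };
}
```

An empty `declined` list on a large change is itself a signal worth a second look.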

The leash is what makes it useful

The point is not to make the model small. The point is to make the work bounded enough that the model can be useful without becoming a new production surface with no operating contract.

Autonomy without a leash is a demo. Autonomy with a tight tool surface, a real verification loop, and a boring audit trail is an engineering tool. That is the version I will let near a repo.