Agents Need Boring Tools
Most of the reliability work on agent systems I’ve watched up close is not prompt engineering. It is tool design. A model will happily use a badly specified tool in a subtly wrong way, and the failure surfaces only later, in a downstream step that is hard to trace back to the bad call.
The fix is unsexy: fewer tools, narrower arguments, error messages written for a reader who cannot see the full system, and a strict separation between tools that observe and tools that act.
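To make those rules concrete, here is a minimal sketch of what a "boring" tool definition might look like. Everything here is hypothetical (the `ToolSpec` shape, the `restart_service` tool, the staging-only policy are invented for illustration, not from any real agent framework): the tool declares up front whether it observes or acts, narrows its argument space to an explicit allowlist, and returns an error a model can act on without seeing the rest of the system.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool record: name, observe/act flag, and the callable."""
    name: str
    read_only: bool                 # observe vs. act, declared explicitly
    allowed_envs: tuple[str, ...]   # narrow arguments: an allowlist, not a free string
    run: Callable[[str], str]

def restart_service(env: str) -> str:
    if env not in ("staging",):
        # Error written for a reader who cannot see the full system:
        # state what was received, what is allowed, and what to do instead.
        return (
            f"error: restart_service only accepts env='staging'; got env={env!r}. "
            "Production restarts go through a separate, human-approved tool."
        )
    return f"restarted worker pool in {env}"

RESTART = ToolSpec(
    name="restart_service",
    read_only=False,          # this tool acts; observation tools would set True
    allowed_envs=("staging",),
    run=restart_service,
)
```

The point of the sketch is not the specific fields but that every rule of thumb above is encoded in the definition itself rather than in a prompt: the model cannot confuse an observing tool for an acting one, and a bad argument produces a self-contained correction rather than a stack trace.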
This post is a collection of the rules of thumb I now apply before a new tool goes into an agent’s toolbox.