Agents Need Boring Tools
Most of the reliability work on agent systems I’ve watched up close is not prompt engineering. It is tool design. A model will happily use a badly specified tool in a subtly wrong way, and the failure surfaces only later, in a downstream step that is hard to trace back to the bad call.
The fix is unsexy: fewer tools, narrower arguments, error messages written for a reader who cannot see the full system, and a strict separation between tools that observe and tools that act.
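To make those rules concrete, here is a minimal sketch of what a "boring" tool definition might look like. Everything here is hypothetical (the `ToolSpec` shape, the `restart_service` tool, the staging-only policy are invented for illustration, not from any real agent framework): the tool declares up front whether it observes or acts, narrows its argument space to an explicit allowlist, and returns an error a model can act on without seeing the rest of the system.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool record: name, observe/act flag, and the callable."""
    name: str
    read_only: bool                 # observe vs. act, declared explicitly
    allowed_envs: tuple[str, ...]   # narrow arguments: an allowlist, not a free string
    run: Callable[[str], str]

def restart_service(env: str) -> str:
    if env not in ("staging",):
        # Error written for a reader who cannot see the full system:
        # state what was received, what is allowed, and what to do instead.
        return (
            f"error: restart_service only accepts env='staging'; got env={env!r}. "
            "Production restarts go through a separate, human-approved tool."
        )
    return f"restarted worker pool in {env}"

RESTART = ToolSpec(
    name="restart_service",
    read_only=False,          # this tool acts; observation tools would set True
    allowed_envs=("staging",),
    run=restart_service,
)
```

The point of the sketch is not the specific fields but that every rule of thumb above is encoded in the definition itself rather than in a prompt: the model cannot confuse an observing tool for an acting one, and a bad argument produces a self-contained correction rather than a stack trace.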
This post is a collection of the rules of thumb I now apply before a new tool goes into an agent’s toolbox.