Prompt Caching Changed My Agent Architecture
Once the prefix is nearly free, the cheap move and the smart move stop being the same thing.
Where this goes
The human moved up a level. You no longer write the fix; you specify the goal and judge the result. The scarce skill now is a crisp success criterion and an honest review of the diff.
An agent is only as good as the loop it runs. Perception, plan, act, verify — the failure usually hides in the verify step, where the model that wrote the code also grades it. Independent checks are not optional; they are the architecture.
Context is the scarce resource. The teams shipping real agentic work treat the context window like a budget: what goes in, what gets summarized, what gets dropped. Most agent failures are context failures wearing a different costume.