
Loops, Cycles, and Runaways

Loops, cycles, and runaways are the default failure modes of any agent that acts in steps. Every framework in the open-source community runs into them, and the difference between a demo and a product is whether you detect them, stop them, and surface them instead of letting an agent quietly burn time and money.
Petko D. Petkov, on a break from CISO duties, building cbk.ai

If you point an agent at a task and give it a few tools, sooner or later it will get stuck. It will call the same tool, get the same answer, and call it again. Or it will start repeating itself mid-response and never find the exit. Or it will just keep taking one more step, then one more, never deciding it is finished.

These are the shapes I run into most. A loop, where the model repeats an action and expects a different result. A cycle, where it bounces around a small set of actions that never move things forward. A runaway, where a single response collapses into the same phrase again and again. They look different on the surface but share the same root: the model, left to its own devices, will do it forever.
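To make the three shapes concrete, here is a minimal Python sketch of how each one might be detected. Everything in it is an assumption made for illustration: the function names, the thresholds, and the idea that each tool call is recorded as a hashable value such as a string. It is not how any particular framework does it.

```python
from typing import Hashable, Sequence


def is_loop(calls: Sequence[Hashable], threshold: int = 3) -> bool:
    """Loop: the last `threshold` recorded calls are all identical."""
    if len(calls) < threshold:
        return False
    tail = calls[-threshold:]
    return all(call == tail[0] for call in tail)


def is_cycle(calls: Sequence[Hashable], max_period: int = 4, repeats: int = 3) -> bool:
    """Cycle: the history ends in a short pattern of different calls, repeated."""
    for period in range(2, max_period + 1):
        span = period * repeats
        if len(calls) < span:
            continue
        tail = list(calls[-span:])
        pattern = tail[:period]
        # A pattern made of a single distinct call is a loop, not a cycle.
        if len(set(pattern)) > 1 and all(
            tail[i] == pattern[i % period] for i in range(span)
        ):
            return True
    return False


def is_runaway(text: str, n: int = 4, min_words: int = 40, max_repeated: float = 0.5) -> bool:
    """Runaway: most n-word windows in the text are duplicates of each other."""
    words = text.split()
    if len(words) < min_words:
        return False
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    repeated_share = 1 - len(set(ngrams)) / len(ngrams)
    return repeated_share >= max_repeated
```

The thresholds are guesses and would need tuning against real traces. A low `threshold` or `repeats` will flag legitimate retries, and a high one lets the agent spin longer before anything notices.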

Everyone has this problem. Go spend ten minutes in the issue tracker of any popular agent framework or harness and you will find it. The most well-known coding agents in the world do it too. Nobody wrote bad code to cause it. It is simply what a system does when it acts in steps with no sense of progress.

What separates a demo from a product is whether you treat that as acceptable. In a demo a loop is a funny screenshot. In production it is a stuck conversation, a frustrated user, and a cost line nobody can explain. The agent looks like it is working, because technically it is. It is just working on nothing.

So we take it seriously, which mostly means doing unglamorous things extremely well. We must notice when the same call keeps returning the same result, and stop. We must catch a response that has collapsed into repetition before it drains the budget. We must put a ceiling on how many steps a single turn can take. And when any of those guards trips, we must surface it plainly instead of swallowing it, so we can see exactly what the agent was doing when it wedged itself.
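As a rough sketch of two of those guards working together, the Python below wraps a single turn with a step ceiling and a repeated-result check, and raises an error that carries the full trace when either trips. The names `run_turn`, `GuardTripped`, `next_action`, and `call_tool` are hypothetical, as are the default limits. This illustrates the idea and is not a description of CBK's implementation.

```python
import hashlib
import json
from typing import Any, Callable, Optional


class GuardTripped(Exception):
    """Raised when a guard stops a turn. Carries the trace so nothing is swallowed."""

    def __init__(self, reason: str, trace: list):
        super().__init__(f"{reason} after {len(trace)} step(s)")
        self.reason = reason
        self.trace = trace


def fingerprint(tool: str, args: dict, result: Any) -> str:
    """Stable hash of a call and its result, used to spot exact repeats."""
    payload = json.dumps([tool, args, result], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_turn(
    next_action: Callable[[list], Optional[tuple]],
    call_tool: Callable[[str, dict], Any],
    max_steps: int = 20,
    max_repeats: int = 3,
) -> list:
    """Run one turn. `next_action` returns (tool, args), or None when finished."""
    trace: list = []
    seen: dict = {}
    for _ in range(max_steps):
        action = next_action(trace)
        if action is None:
            return trace  # the agent decided it was done
        tool, args = action
        result = call_tool(tool, args)
        trace.append({"tool": tool, "args": args, "result": result})

        # Guard 1: the same call keeps returning the same result.
        key = fingerprint(tool, args, result)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] >= max_repeats:
            raise GuardTripped("repeated identical call and result", trace)

    # Guard 2: ceiling on how many steps a single turn may take.
    raise GuardTripped("step ceiling reached", trace)
```

A caller would catch `GuardTripped`, log `reason` and `trace`, and tell the user the turn was stopped rather than returning a quiet empty answer. Counting repeats across the whole turn, not only consecutive ones, is a deliberate choice here so that short cycles trip the same guard.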

It is quiet work, but it decides whether an agent survives five minutes or survives real users. We would rather spend our time here than pretend the failure modes are someone else's problem.

That is the kind of reliability we build into CBK, so the agents you put in front of real people keep moving forward instead of spinning in place.