Claude Code's '/goals' separates the agent that works from the one that decides it's done

VentureBeat AI · May 14, 2026, 6:12 PM · Also reported by 3 other sources

Why this matters: a development in AI with implications for how people work, create, and decide.

A code migration agent finishes its run, and the pipeline looks green. But several pieces were never compiled — and it took days to catch. That's not a model failure; that's an agent deciding it was done before it actually was.Many enterprises are now seeing that production AI agent pipelines fail not because of the models’ abilities but because the model behind the agent decides to stop. Several methods to prevent premature task exits are now available from Lang Chain, Google and Open AI, though these often rely on separate evaluation systems. The newest method comes from Anthropic: /goals on Claude Code, which formally separates task execution and task evaluation. Coding agents work in a loop: they read files, run commands, edit code and then check whether the task is done. Claude Code /goals essentially adds a second layer to that loop. After a user defines a goal, Claude will continue to turn by turn, but an evaluator model comes in after every step to review and decide if the goal has been achieved. The two model splitOrchestration platforms from all three vendors identified the same roadblock. But the way they approach these is different. OpenAI leaves the loop alone and lets the model decide when it’s done, but does let users tag on their own evaluators. For LangGraph and Google’s Agent Development Kit, independent evaluation is possible, but requires developers to define the critic node, write up the termination logic and configure observability. Claude Code /goals sets the independent evaluator's default, whether the user wants it to run longer or shorter. Basically, the developer sets the goal completion condition via a prompt. For example, /goal all tests in test/auth pass, and the lint step is clean. Claude Code then runs, and every time the agent attempts to end its work, the evaluation model, which is Haiku by default, will check against the condition loop. If the condition is not met, the agent keeps running. If the condition is met, then

Article preview — originally published by VentureBeat AI. Full story at the source.

Read full story on VentureBeat AI → More top stories

Also covered by

The Verge Microsoft starts canceling Claude Code licenses Hacker News A Claude Code and Codex Skill for Deliberate Skill Development TechCrunch AI Clawdmeter turns your Claude Code usage stats into a tiny desktop dashboard

Aggregated and edited by the Scoop newsroom. We surface news from VentureBeat AI alongside other reporting so you can compare coverage in one place. Editorial policy · Corrections · About Scoop

Claude Code's '/goals' separates the agent that works from the one that decides it's done

Also covered by

More in ai