Fine-tuning forgets. RAG leaks context. Hypernetworks build the model your agent needs on demand.
Enterprise teams keep watching the same thing happen. An AI agent demos beautifully, goes to production, and stalls: it runs for a short stretch, then needs a human to top up its context and check its output, and the promised efficiency drains into supervision. The agent did the work; you did the watching. It’s one reason so many agent pilots never turn into production systems.
The pitch on the other side of that wall is the one every team wants to believe: an agent that runs a long job on its own, overnight if it has to, and leaves a person to validate only the last 10%. Whether that is achievable turns on a problem the orchestration conversation mostly skips. When AI firm Chroma tested 18 leading models, every one lost accuracy as its input grew, a property of how attention works, not a gap a stronger model closes. An agent fed more and more of your business as it runs does not get steadier. It gets shakier.
This is the layer beneath the orchestration race. Routing, durable execution and observability all assume each agent is already competent enough to coordinate in the first place. The deeper question is how long an agent can run before a human has to step in, and that comes down to where your company's knowledge lives relative to the model. Both standard fixes leave a human in the loop.
Why teaching a model your business keeps you in the loop
Frontier models keep getting more capable, and the gap does not close, because it is not a capability problem. It is about where your knowledge sits relative to the model, and enterprises have had two ways to place it there. The first is fine-tuning, which bakes knowledge into the weights. It remains subject to catastrophic forgetting, a problem identified in the 1980s and still unresolved in 2026: teaching a model something new tends to erode what it already knew. Teams work around it by isolating each task in its own fine-tuned model or adapter, which produces a sprawling estate of models that raises cost and governance