What's Continual Learning, and Why Might We Expect To See It In Advanced LLM Agents?
Summary

We say that an agent is a continual learner if it undergoes persistent updates during deployment. That’s more-or-less a binary criterion, but there are several other components to being good at continual learning that are much more continuous. We say an agent is an effective continual learner to the extent that it: (1) constantly undergoes persistent updates during deployment, (2) learns new useful knowledge and capabilities efficiently via those updates, and (3) does not (catastrophically) forget existing capabilities in the process.

CL lies on a spectrum, major capability advances may not require CL breakthroughs, and early forms already exist (e.g., agentic RAG, CLAUDE.md, SKILL.md, and personalization prompts).

The basic reason to expect effective CL is that it would probably make AI agents better at important tasks on which AI companies are trying to improve performance, most notably AI research itself. So far, nothing has allowed LLM agents to become as good at end-to-end research as capable humans become after years of practice, despite the fact that LLM agents collectively accumulate research experience much faster than individual humans. This argument applies to most open-ended remote labor jobs. CL is also closely tied to sample efficiency on long-horizon and hard-to-verify tasks.

The main components of an LLM agent that can receive persistent updates during deployment are model weights, the context window, memory banks with natural language or neural activation memories, the agent scaffold, and tools. We expect different update mechanisms will suit different types of CL, so a mixture is likely. Weight updates are probably needed for some parts of effective CL since LLMs seem quite bad at handling lots of interrelated complexity in their context window.
But naïve weight updates often degrade existing capabilities, which is why frontier systems haven't widely adopted them yet.

Constantly implementing better and better LLM agent post-deployment update mechanisms is a si