The White House Wants Anthropic to Block All Jailbreaks. That May Not Be Possible
Key takeaways
- Anthropic has said for days that the administration’s concerns are overblown and that the effects of the jailbreaks are minimal.
- At this stage, the administration essentially views the situation as Anthropic’s problem to fix, according to three people familiar with discussions.
- But on a more fundamental level, it remains unclear how Anthropic is supposed to prevent jailbreaking.
Why this matters: a development in AI with implications for how people work, create, and decide.
Photo-illustration: WIRED Staff; Getty Images
The Trump administration’s disagreement with Anthropic over its most advanced AI models appears to be fast coming to a head.
Trump officials tell Inner Loop that if Anthropic wants to rerelease Claude Fable 5, the company will need to take steps to actually address what the government alleges are vulnerabilities. The administration took the AI model offline with export controls last week over concerns about jailbreaking, a method of using prompts to get around a model’s safeguards.
Anthropic has said for days that the administration’s concerns are overblown and that the effects of the jailbreaks are minimal. It reiterated this position to the Commerce Department and the office of National Cyber Director Sean Cairncross in a technical meeting on Monday.