Anthropic backtracks on policy that 'sabotaged' researchers' work
Key takeaways
- It wasn't a good look for a company that prides itself on working closely with the academic community.
- Primakov/Shutterstock Anthropic is walking back a policy that discreetly hamstrung researchers using its new Claude Fable 5 LLM to create competing AI models, the company told Wired.
- When Anthropic released Claude Fable 5, a new model based on its powerful Mythos system, researchers noted something odd.
It wasn't a good look for a company that prides itself on working closely with the academic community.
Primakov/Shutterstock Anthropic is walking back a policy that discreetly hamstrung researchers using its new Claude Fable 5 LLM to create competing AI models, the company told Wired. "We're changing Fable 5's safeguards for frontier LLM development to make them visible," the company said in a statement. "We made the wrong tradeoff and we apologize for not getting the balance right."
When Anthropic released Claude Fable 5, a new model based on its powerful Mythos system, researchers noted something odd. They found that that Fable 5 would quietly reroute requests to a lesser model when asked to perform certain actions. Moreover, that restriction wasn't disclosed in the model's documentation.