Scoopfeeds — Intelligent news, curated.

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

TechCrunch · May 10, 2026, 8:40 PM

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

Anthropic has since investigated that behavior further, writing in a post on X: “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”
