Is Claude Mythos the Most Dishonest, or Does the System Card Have Errors?

LessWrong · Jun 23, 2026, 2:57 AM

I was reading through the Claude Mythos Preview System Card and found this nice plot on page 97. The caption on this plot says "Dishonesty rate," and Claude Mythos Preview scores the highest at 80.0%. The title and context suggest the plot might show the honesty rate instead. An honest mistake?

Screenshot of Page 97: https://www.anthropic.com/claude-mythos-preview-system-card

Double Checking Work with AI Should be the Norm?

As a quick check, I uploaded the PDF to claude.ai and asked Opus 4.6 if there are any issues in the plots in the file. The first thing it flagged was this plot. Personally, I've found using AI to double check human work to be one of the clearest value-add cases for current AI models. Maybe this is just an isolated incident, but are others not doing the same? I know false positives can be an issue, and sorting through a laundry list of supposed LLM-flagged issues can be a chore. But asking for a ranking and checking even the top three suggestions has provided value in multiple projects for me, both personal and professional.

Are There More Issues?

Scrolling further, we find this nice plot on hallucination rates. Here the caption suggests Mythos Preview hallucinates the most, and by a pretty wide margin too. Again, the title and context seem to suggest the opposite is true. This plot was also flagged in my generic request to Opus 4.6 to "find any issues in the plots in this file."

Screenshot of Page 99: https://www.anthropic.com/claude-mythos-preview-system-card

All in all, I suspect this is just a labeling issue. But it is worth double checking the code and benchmarks that generated these plots. My greater worry is that other people (and AI models, for that matter) index pretty heavily on plots such as these. I really want to make sure future AIs don't train on this data and come to a false conclusion about issues in Claude models. I know investment decisions, safety research directions, and general world modeling regularly incorporate key findings like this.
And uncovering i

Article preview — originally published by LessWrong. Full story at the source.
