agentic-ai

Third-parties should focus on scrutinising system cards

LessWrong · Jun 29, 2026, 12:28 AM

By default, I expect system cards will get worse, which would be bad. Some mechanisms could improve system cards, but I expect they will be outweighed. In any case, I think third-parties should focus on scrutinising system cards — this seems like a great activity for outsiders in the current strategic landscape. I'll sketch what that could look like, and offer some recommendations.It would be bad if system cards degraded.If labs felt pressure to evaluate the risks accurately, they'd be better incentivised to reduce them.If the risks were high enough, and a lab communicated that, then this might prompt drastic government action.It's very plausible that, if labs build misaligned AIs that take over, then most of the employees had a genuine but incorrect belief that the AIs wouldn't take over, based on evidence that was actually flimsy and misleading. Outsiders scrutinising system cards seems like a great mechanism for improving the epistemics within the labs.It's good for the outside community to know which risks are most pressing, so they know which activities to prioritise.By default, I expect system cards to get worse, because…The system will get genuinely more complicated.There will be more models, trained with more techniques, interacting in more ways — plus dozens of ad-hoc patches for the problems that arise.Already no one person holds the whole system in their head, and the fraction that fits in one head will keep shrinking.Any overall safety judgment will come from loosely aggregating many lines of evidence, rather than from a clean structured safety case.I expect the cards themselves to get sloppier.They'll be rushed, written in less wall-clock time, because the pace of AI progress will accelerate.They'll be increasingly AI-generated: AIs will automate more of the experiment design, coding, analysis, and writing. That raises worries about slop, sabotage, and the like. See Current AIs seem pretty misaligned to me (Ryan Greenblatt, 15th April 2026).When AIs are

Article preview — originally published by LessWrong. Full story at the source.

Read full story on LessWrong → More top stories

Aggregated and edited by the Scoop newsroom. We surface news from LessWrong alongside other reporting so you can compare coverage in one place. Editorial policy · Corrections · About Scoop

Third-parties should focus on scrutinising system cards

More in agentic-ai