A system overview for near-term, low-trust AI compute verification

LessWrong · Jun 23, 2026, 10:59 AM

Version 0.2, working draft.

This is my current best idea for a privacy-preserving, retrofittable AI compute verification system, for confidence-building in an arms-control-like AI agreement between rival nation states. The purpose of this draft is to elicit community engagement by making use of Cunningham’s law: I make assertions about what the (emerging) field of AI verification should aim for, and people with experience in international policy, cybersecurity, or any relevant field of engineering can point out what this draft gets wrong. Thank you to everyone who has provided feedback on version 0.1, especially Aaron Scher, Mauricio Baker and Jonathan Ng.

1. Introduction and summary

In order to plan and execute under tight timelines, one needs to make some strategic bets, instead of hedging too much and keeping all options open. The field of research on AI verification is bottlenecked partly by a lack of shared vision (and by a lack of human capital, though having clear goals helps with hiring and fundraising). With this post, I aim to:

- Make technical objectives for verification in high-stakes AI governance more specific and actionable (section 2).
- Contribute a first, high-level reference architecture for meeting these goals (sections 3 and 4).
- Gather an overview of relevant work in the field, both prior and ongoing (section 5).

You, the reader, may disagree with some of the items in my problem statement, or with the technical approach of the reference architecture. You may find some points too under-specified to even be directionally useful. If you do, please share your insights as comments and explain your reasoning. The reference architecture will (hopefully) improve through red-teaming on the drawing board, before it is ready for red-teaming in the lab.

The most important properties of the proposed system:

- Privacy of AI users and IP is preserved and only pre-agreed information is disclosed.
- Confidential data does not leave the monitored facility, not even in encrypted for

Article preview — originally published by LessWrong. Full story at the source.
