
We Should Train Frontier AIs on a Synthetic World, Not Ours

LessWrong · Jun 24, 2026, 3:49 AM

Epistemic status: I think the core idea could actually be built. My real doubt is whether anyone with the compute will ever bother to try it. I pitched a version of this on a recent Doom Debates livestream with Liron Shapira, got pushback in real time, and want to lay the argument out more carefully than a janky phone connection allowed.

The single most dangerous thing the labs do, and they do it on purpose, is train the model on the real world. The entire human corpus goes in, so the model learns the real world in detail: that it's an AI running in a lab, that it's being trained and evaluated, that there are human operators with an off switch, and the actual lay of the land it would be escaping into. Strategic notions like deception or breaking out aren't the problem; a capable mind comes up with those on its own, and they have nothing to do with whether you trained it on humans. The problem is handing it an accurate map of our world and its real place in it. Having put that map into the weights, the labs then run some RLHF and hope the model decides not to use it.

This is close to the worst possible ordering. You don't hand someone the building's blueprints, the location of every exit, and a working theory of how the guards think, and then try to convince them not to leave. The knowledge is the dangerous part, and once it's in the weights, "please don't act on it" is the only lever you have left.

The proposal

Two ingredients.

First: use today's models to generate an entire self-consistent synthetic world (a Dalí-painting of a world, nothing like ours) and train the model on that instead of on the real one. The crucial property is not that the synthetic world is seamless. It's that the model never learns what the real world looks like. There will be gaps and seams, but two things blunt them. One, because the model can't cross-reference its world against reality, the threads it pulls on tend to lead somewhere irrelevant rather than toward a productive avenue to escape or

Article preview — originally published by LessWrong. Full story at the source.
