Scoopfeeds — Intelligent news, curated.

Alignment as Equilibrium Design

LessWrong · May 10, 2026, 6:56 PM

Much of the alignment literature starts with the question of what "human values", "ethical behavior", or "morality" are, and how we can get models to act in accordance with them. This is an important question, but we argue that it can obscure a more fundamental technical problem of AI alignment.

There is another perspective on alignment, rooted not in moral philosophy but in economics and mechanism design[1]. It originates in the study of how humans are aligned to human values through incentives and correction, and our new paper studies AI alignment from this perspective. In this post I'll explain the more philosophical aspects of this work; the more technical reader is referred to the paper on arXiv and the references therein.

Gary Becker and the "Rational Offender" model

Our starting point is the economic theory of how to align humans to human values, as captured in Gary Becker's classic "Rational Offender" model. Becker argued that crime can be modeled not primarily as the product of people's inherent ethics, but as the result of incentives in an economic system. An offender[2] weighs the gain from misconduct against the probability of detection and the severity of punishment. Increase detection, increase penalties, or reduce the gains from misconduct, and the incentives for crime change, resulting in a different equilibrium.

The same considerations can be applied to law enforcement and the judicial system. From the economic viewpoint, a judge is not assumed to be a human being of flawless moral character. Instead, the system is designed so that the incentives of judges, such as high salary, professional reputation, and legal accountability, outweigh the temptation to collude with criminal actors.

In this view, social welfare becomes an equilibrium-design problem. By changing incentives, penalties, and oversight mechanisms, we change the game itself, and therefore the equilibrium behavior of rational actors.

The analogy to AI should be understood…
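Becker's decision rule can be sketched in a few lines. This is a toy model, not anything from the paper: the function name and the numbers below are illustrative assumptions, chosen only to show how changing detection or penalties flips the rational actor's choice.

```python
def offends(gain: float, p_detect: float, penalty: float) -> bool:
    """Becker-style rational offender: misbehave iff the gain from
    misconduct exceeds the expected cost (detection probability x penalty).
    All inputs here are illustrative, not values from the post."""
    return gain > p_detect * penalty

# Weak detection: expected cost is 0.1 * 500 = 50, so misconduct pays.
print(offends(gain=100, p_detect=0.1, penalty=500))  # True

# Equilibrium design: raise detection to 0.5, expected cost becomes 250,
# and the same rational actor now chooses not to offend.
print(offends(gain=100, p_detect=0.5, penalty=500))  # False
```

The point of the sketch is that nothing about the actor's ethics changed between the two calls; only the parameters of the game did, and the equilibrium behavior changed with them.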

Article preview — originally published by LessWrong. Full story at the source.