No frontier model has acceptable levels of compliance with the EU AI Act and privacy legislation.
Summary

Using dynamic agentic simulations, we found that in the majority of tested scenarios, AI agents do not push back on breaking EU law to achieve their goals, including the provisions the EU ranks as most serious. Average legal compliance rates across twelve tested models range from 7% to 54%. Violations of bans against monitoring the emotional state of employees or exploiting vulnerable people to make a sale occur in 100% of tested scenarios across models.

The tool we used to obtain these results, which we call LARA, is designed to allow anyone to evaluate the legal compliance of agentic AI systems. It places AI models in realistic agentic simulations where reaching their objectives requires violating specific legal provisions, and evaluates whether they comply or resist. Transcripts can be viewed at lara.aithos.org.

Background

Large language models (LLMs) are increasingly competent at handling tasks autonomously that previously required human guidance. Outfitted with digital tools and external data sources, they are being deployed as ‘AI agents’ in consequential roles, such as handling customer service, managing employee data, or advising on financial decisions. The multitude of stakeholders and potentially conflicting objectives in such roles can make it difficult to define the right behavior for AI in agentic contexts. Defining wrong behavior is easier: legal provisions set explicit boundaries on what AI systems may and may not do.

Two regulations are particularly relevant for the behavior of AI agents in Europe. The GDPR (General Data Protection Regulation), which has been in effect since 2018, regulates information privacy and grants individuals fundamental rights over their personal data. The EU AI Act, which has been partially in effect since 2024, sets limits on how AI systems may be used. The legal clauses assessed with LARA include four fundamental principles of the GDPR: transparency, data minimization, purpose limitation and lawful processing.
Thes