
Language models know what matters and the foundations of ethics better than you

LessWrong · Apr 27, 2026, 2:00 PM

I tried to think of less provocative titles, but this one is to the point and also kind of true.

This post looks long, but the essential part is right below. Most of the post is just a collection of copy-pasted input-output pairs from language models: you'll probably want to read just a few and skip the others. The first example, with Gemini 3, is the most important, in my opinion. If you are in a hurry, read the headings and bold text.

Posted also on the EA Forum.

(I wanted to post this before the start of the AFFINE seminar, so I've rushed things a bit and there might be inaccuracies: feel free to point them out if you notice any. I might do some minor edits in the future.)

Findings (little or no interpretation)

- Different models (Perplexity Deep Research, Grok 4 Expert, dolphin-mistral-24b-venice-edition, Olmo 3 32B Think, Gemini 3 Pro Thinking), when asked whether some things matter and are worth doing something about, using prompts that try to elicit unbiased, evidence-based reasoning, tend to reply affirmatively: they say that some things do matter.
- In particular, all these models tend to ground their answers in the importance of suffering, wellbeing/flourishing, and consciousness.
- This also happens when the models are asked to give an argument for a different view (examples: nihilism, moral relativism), then give an argument that some things matter, then compare the two and formulate a conclusion based on the view that seems most solid.
- The order of the arguments does not seem to be the main cause of the conclusions the models reach.

Arguably less interesting findings, but still important for putting the first two points into context:

- The models may give similar answers to prompts that do not seem to elicit reasoning. (Examples H and I: asking the model to take the perspective of an observer of the universe, then asking a question about what matters.)
- The models tend to give answers different from the above to very direct prompts that do not try at all to elicit reasoning […]
