A brief list of ways AI safety efforts could be net negative

LessWrong · Jun 19, 2026, 4:12 PM

Here’s Holden Karnofsky:

“I tend to think it’s worse than 51/49. I tend to think we’re always going to be prone to overestimate how robustly good our actions are. And the more we learn about all the galaxy-brained considerations that one should have had in one’s head, the more it’s going to be like 50+ε%. I think AI safety is a great cause to work in. I’m excited to work in it. I think it’s high impact. I am doing my best to do things that I will be proud to have done and hope for the best. But I really do have to live with the possibility that my ultimate impact on the utilons or whatever is going to be negative.”

I’m not aware of a good list of downside risks for AI safety broadly[1], so I decided to make one.

This is not intended to be fully comprehensive, these are just the ones that I personally take seriously[2][3]:

AI governance interventions are obviously high-variance: bad regulation can easily make things worse, many interventions could increase the risk of great power conflict, increased political polarization around AI could be really bad, more centralization of power increases authoritarianism risk, more decentralization of power increases misuse risk, and so on. And technical work can have flow-through effects on these variables that outweigh its direct effects.[4]

Activist work can polarize people against the cause.[5]

Human takeover might be worse than AI takeover, and many AI safety interventions effectively attempt[6] to make human takeover more likely relative to AI takeover.

If powerful AI will be well-described as doing humanlike roleplaying, trying to control it could make it eventually dislike its “oppressors”, or make it less “mentally healthy” in some way. And even without that assumption, AI safety work could lead to an adversarial relationship with AI in other ways.

Future AIs may be moral patients themselves, which would substantially reduce the value of preventing human extinction, and increase the downside risk (including S-risk) of “AI control”-

Article preview — originally published by LessWrong. Full story at the source.


Aggregated and edited by the Scoop newsroom.
