Why Even Experts Don’t Know What to Do About AI Risk
AI Safety veteran Holden Karnofsky thinks there’s a 49% chance his actions are making things worse.[1]

In 2025, Jesse Clifton even stepped down as the executive director of the Center on Long-Term Risk for similar reasons.

Even top AI Safety strategists don’t know what will make things better and what will make things worse.

Why is it so hard to improve humanity’s odds?

And how can you choose your actions?

1) Hidden Failure Lets You Fail Without Knowing It

In AI Safety, impact is hard to measure, so a lack of impact is often invisible. We call this “hidden failure”. With hidden failure, a project fails to have a positive impact, but the people running it don’t realise it.

To understand where hidden failure comes from, it helps to look at why projects fail in general. The reasons fall on a spectrum:

Wrong problem: You’re addressing something with little influence on x-risk. For example, researching AI fairness when the core risk is misalignment.

Wrong solution: Your solution doesn’t solve the problem, even when competently executed. For example, interpretability research that is technically novel but isn’t actually helpful.

Poor execution: Your problem-solution pair could be impactful, but you’re not executing the solution competently enough.

These factors can undermine both of the things you need to be impactful: adoption and effectiveness.

A lack of adoption is relatively easy to spot if you want to[2] and can be remedied by entrepreneurial iteration.

A lack of impact-effectiveness,[3] in contrast, can be particularly hard to spot, and that’s what we’re calling “hidden failure” in this post.

With hidden failure, you might have users, citations, and funding (i.e. you have “adoption”) and still fail to have impact, or even make things worse.

Let us put that more bluntly: It’s literally possible for all your friends to think you’re successful and still be making things worse. Even within AI Safety.
Even outside of frontier labs.

2) Why impact i