Rohin Shah on AGI Safety
Rohin Shah was recently interviewed on the 80,000 Hours podcast about his views on AGI safety and his work at Google DeepMind. I'm posting the transcript below to encourage further discussion. I think the interview is interesting, though I disagree on several points, especially alignment difficulty and CoT monitoring.

Transcript

Who’s Rohin Shah? [00:00:00]

Rob Wiblin: Today I’m speaking with Rohin Shah, who is head of AGI alignment and safety at Google DeepMind.

I suppose, Rohin, you’ve ended up, for better or worse (hopefully for better), being one of the more influential, dare I even say powerful, people to come out of the AGI alignment and safety ecosystem and school of thought.

You were generous enough to be super opinionated with me when you came on the show two years ago, and judging by the notes that you’ve sent over this week, you’re ready to be opinionated again.

Thanks so much for coming back on the show, Rohin.

Rohin Shah: Yeah, thanks a lot, Rob, and that’s a very generous intro. And in the interest of being very opinionated, I do want to emphasise that these opinions are mine alone. They’re not meant to represent the opinions of Google or Google DeepMind.

Rob Wiblin: That’s how we like it. If you were representing Google DeepMind, it might sound more like a press release.

Why Rohin thinks we won’t get catastrophic misalignment [00:00:49]

Rob Wiblin: So you were really very early in the scheme of things to the whole misalignment, AI/AGI security issues. I suppose you got involved in 2017, so you’re in the first few percent of people who started working on this professionally. But despite that, you think that probably we’re not going to get catastrophic misalignment, that our chances are really pretty good, and that probably prosaic, ordinary alignment techniques (the kinds of things that Google DeepMind and other AI companies are doing) will probably succeed at preventing at least catastrophic misalignment.
Why do you think our chances are so good?

Rohin Shah: There’s a