Sequent: scale and automation for higher confidence in alignment
Alignment is not on track

Artificial superintelligence (ASI) may be developed in the next few years. It is unclear whether alignment is on track to be ready on the same timeframe. At a minimum, the empirical programs at AI labs are unlikely to deliver a priori confidence, before training ASI, that things will go well. We are starting a large nonprofit research organization, Sequent, that aims to clear a higher bar:

- We are aiming at higher confidence via a portfolio of theory and empirics bets, all of which could fail, such that if any succeed, they would give us more a priori confidence in aligned outcomes.
- We are investing heavily in automation to accelerate progress on these bets.
- We believe that theory unlocks higher automation. Taking a more principled approach offers better filters for deciding which directions of automated research are promising (a proof is worth a thousand experiments, and even a pseudo-proof is worth hundreds).

Who[1]: researchers from the UK AISI’s Alignment Team and Timaeus, with more to come. We’re aiming at 40-80 FTE two years from now. The Alignment Team ran the £30m Alignment Project, and Timaeus has pioneered applying singular learning theory (SLT) to alignment.

Founding team:

- Geoffrey Irving: Chief Scientist at UK AISI; ex-DeepMind, OpenAI, and Google Brain.
- Daniel Murfet: Head of Research at Timaeus; left tenure to pioneer SLT for alignment.
- AISI Alignment: Alex Holness-Tofts and Jacob Pfau.
- Timaeus: Jesse Hoogland, Stan van Wingerden, and Marco Cozzi.
- Joined by researchers from Timaeus and more researchers from the UK AISI’s Alignment Team.

Where: a large in-person presence in the Bay Area (Berkeley), as well as researchers working remotely from London, Melbourne, and elsewhere.

In this post, we discuss:

- What it means to aim at higher confidence
- Why start a new big organization
- Whether sufficiently fast progress is possible with automated research

Aiming at higher confidence

In an ideal world, we would develop an approach to building superintel