fab: how to do (alignment) research at scale

LessWrong · Jun 25, 2026, 9:46 PM

Over the last month, I've been working on a project I'm calling fab.[1] Nominally, it's an interface that enables a human researcher to make sense of research produced by many agents running in parallel. I haven't "finished" fab (in fact, I'm stuck searching for the crux of building something like this), but I still think it is worth posting a short explanation of the problem it is meant to solve, and how it tries to address it.

In what follows I'm imagining this being used for automated alignment research. This is not because I see it as a silver bullet. I am just observing that a lot of current empirical alignment work could be automated (and in fact some of it already has been, just not the more open-ended stuff). If we can shift more work to the left through automation, that's a win. I am deliberately assuming humans as the final decision-makers, and asking how far we can get by augmenting human judgement.[2]

Research at scale

Imagine a near future where you can spin up dozens of agents to do research for you in parallel.[3] You loosely specify a question you're interested in, and they do all the legwork: operationalise that question, look at prior work, run quick experiments to get a mechanistic understanding, run bigger experiments (training models, perhaps, or doing white-box interpretability), analyse the results, position the findings in the broader context. Ideally, they all take slightly different approaches, trying to find different "handles" on the question you specified. When they're done, you have on the order of tens of write-ups to review, with the ultimate goal of updating your understanding of that question in light of new evidence.

I think this is actually really hard, for a couple of reasons that have to do with the interplay between how we do research and current agent failure modes.[4]

Attention is the big problem. There are only so many human researchers in a given field. In alignment, there are maybe a few thousand in total, with maybe 30 or so wh

Article preview — originally published by LessWrong. Full story at the source.