Why aren't there more AlphaFolds?
This essay began as a talk at the 2026 Gold Lab Symposium. You can watch the talk itself here. The content is the same, but I've expanded on it more here.

The universe doesn't give up its secrets easily. Every time we learn some truth, we depend on prior, equally hard-won knowledge, and usually some luck. The easy truths are getting rarer, so the excitement around speeding up science with deep learning models is understandable, especially after LLMs surprised us by learning so much just from predicting text. Today's frontier LLMs are trained on almost the entirety of human scientific output. So why haven't they transformed our understanding of the universe? And why is AlphaFold, a deep learning model for predicting protein structures (and very much not an LLM), one of only a handful of models to have transformed science? Why aren't there more AlphaFolds?

The answer hinges on the difference between training models on human-generated text, which is itself downstream of knowledge acquired the hard way, and training them on observations at the limits of science where the definitive text is yet to be written. It's easiest to see this by studying the process of scientific discovery. So let's start by looking at two examples, with the first being the story of one of the most deadly diseases of the age of discovery: scurvy.

Scorbutic and Confused

On July 8th 1497, the Portuguese mariner Vasco da Gama sails from Lisbon to find a path from Europe to India via sea, commanding an armada of four ships and 170 crew. Six months later, in January 1498, after sailing deep into the South Atlantic Ocean in search of westerly winds, they arrive at the Mozambique coast near Quelimane. The ships' journal[1]:

“Many of our men fell ill here, their feet and hands swelling, and their gums growing over their teeth, so that they could not eat.”

Vasco da Gama's path to India around Africa. The stop at Mozambique is circled in red.

We know what's happening to these sailors. Their vitamin C reserves are