Today (September 21), the Lasker Foundation announced this year’s award winners. John Jumper, a computational biologist at DeepMind, and Demis Hassabis, cofounder and CEO at DeepMind, were awarded the 2023 Albert Lasker Basic Medical Research Award for “the invention of AlphaFold, the artificial intelligence (AI) system that solved the long-standing challenge of predicting the three-dimensional structures of proteins from the one-dimensional sequence of their amino acids,” the foundation stated.
Jumper and Hassabis led the AlphaFold team that revolutionized structural biology by making protein structure prediction both fast and accurate. Their approach brought together different backgrounds and disciplines, and researchers have adopted the platform to answer diverse biological questions.
Small molecule, big problem
Between 1957 and 1960, Nobel Prize laureate John Kendrew, a biochemist at the Medical Research Council Laboratory of Molecular Biology, resolved the first structural model of a globular protein, myoglobin.1 After studying the results, 1972 Nobel Prize laureate Christian Anfinsen, a biochemist at the National Institutes of Health, postulated that, in theory, a protein’s amino acid sequence should fully determine its structure. However, protein structure was notoriously difficult to study.
Scientists relied heavily on X-ray crystallography for decades to determine protein structures, but researchers could spend years attempting to crystallize proteins. Then, the invention of cryogenic electron microscopy (cryo-EM) shed some light on proteins’ elusive structures, but the microscope images often had low resolution. It took years to slowly advance cryo-EM, but by 2019, scientists had used cryo-EM to determine the structures of almost 4,000 proteins in the Protein Data Bank (PDB) out of 150,000 entries.2 This is only a fraction of the estimated tens of millions of protein sequences.
Automating the protein path
To scale up protein structure predictions, researchers turned to artificial intelligence. In 1994, John Moult and Krzysztof Fidelis, both computational biologists at the University of Maryland, founded the Critical Assessment of Structure Prediction (CASP) competition, a biennial test in which groups predict the three-dimensional structures of several proteins that were already verified experimentally but not released publicly. Teams received accuracy-based scores out of 100 using the Global Distance Test (GDT).3 Since CASP’s inaugural event in 1994, the average score steadily increased from 20 to more than 50. According to the organizers, a score of 90 is the threshold for matching experimentally determined structures.
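To make the scoring concrete, here is a minimal sketch of how a GDT-style score can be computed. It uses the common GDT_TS variant, which averages the fraction of alpha-carbon atoms that fall within 1, 2, 4, and 8 angstroms of their experimentally determined positions. The function name `gdt_ts` and the toy coordinates are hypothetical, and the sketch assumes the two structures are already superimposed; the actual assessment software searches over many superpositions to find the best one, so this is an illustration rather than the official implementation.

```python
import math

# Distance cutoffs (in angstroms) averaged by the GDT_TS variant of the score.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(predicted, experimental, cutoffs=CUTOFFS):
    """Return a GDT_TS-style score (0 to 100).

    Both arguments are equal-length lists of (x, y, z) alpha-carbon
    coordinates. Assumption: the structures are already superimposed.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")

    # Per-residue distance between the predicted and experimental positions.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues within each cutoff, then the average over cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues displaced by 0.5, 1.5, 3.0, and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, experimental)
print(score)  # 56.25
```

In the toy example, one of four residues lies within 1 angstrom, two within 2, and three within both 4 and 8, so the score is 100 × (0.25 + 0.5 + 0.75 + 0.75) / 4 = 56.25. A perfect prediction scores 100.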
One of the earliest approaches was developed by David Baker, a biochemist and computational biologist at the University of Washington. His program, Rosetta, used short segments from the PDB to predict protein structures. Baker and his team made several iterations that consistently improved the program’s performance in early 2000s CASP competitions. However, progress in CASP then stagnated.
DeepMind, an artificial intelligence company cofounded by Hassabis in 2010, succeeded in designing AI that could beat human players at chess and at the more challenging game of Go (AlphaGo). As Hassabis watched AlphaGo play, it reminded him of Baker’s online game Foldit, which was released in 2008, where players explored and created accurate protein structure models. Shortly after AlphaGo’s success in 2016, DeepMind aimed to tackle the next challenge: protein folding.
In 2016, Hassabis believed that his team could create a protein prediction system with machine learning as a core component. This would be one of the first of its kind, debuting at the competition in 2018 as AlphaFold1.4 Unlike traditional AI approaches that relied on preconceived logic, machine learning runs through iterations of the data to discover patterns.
Hassabis and Jumper’s team won that year’s CASP, and AlphaFold1 left quite an impression by creating highly accurate structures for 24 out of 43 modeling domains. AlphaFold1 starkly outperformed the next best method, which achieved high accuracy for 14 out of 43 domains. However, the AlphaFold team knew that it hadn’t reached its full potential to serve biologists; there was more work to be done.
Soon after, Jumper took the lead in redesigning the AlphaFold algorithm with an interdisciplinary team of biologists, chemists, and biophysicists. Hassabis, Jumper, and the AlphaFold team brainstormed ways to fine-tune the algorithm to ensure that AlphaFold2 learned efficiently.
Incorporating a larger database for training greatly helped increase the accuracy of the software’s prediction capabilities. “My contribution was done mainly to deliver them these bigger, more comprehensive metagenomic protein databases that they can use for the training part or also for the inference part at the end,” said Martin Steinegger, a computational biologist at Seoul National University who helped develop AlphaFold2.
At the next CASP competition in 2020, AlphaFold2 stunned the attendees with staggering accuracy and achieved a median score of 92.4 GDT across all targets.5 This kind of accuracy rivaled experimental techniques and boasted an average error comparable to the width of an atom. This iteration of AlphaFold succeeded in part because of the complicated architecture built by researchers from different backgrounds and disciplines.
See also “DeepMind AI Speeds Up the Time to Determine Proteins’ Structures”
Tobin Sosnick, a biochemist, biophysicist, and Jumper’s doctoral advisor at the University of Chicago, said, “He took the folding principles he worked on here [for his doctoral work] and successfully applied them in combination with AI at DeepMind.”
Then in 2021, DeepMind publicly released the source code for AlphaFold and its impressive database of more than 350,000 proteins in collaboration with the European Bioinformatics Institute at the European Molecular Biology Laboratory.6 This database has since grown to more than 200 million structures.
“It was AlphaFold that pushed the accuracy over a critical limit where people were now saying the sequence-to-structure problem has largely been solved,” said Sosnick. The accessibility of this powerful tool inspired its widespread adoption and allowed researchers to fill in the gaps of their own experimental research.
“Normally, computational work was seen as the sidekick for experimental work,” said Steinegger. “Now, AlphaFold’s effects can be seen as a transformative technology that can be as big as a new microscope technology. Now you can see things that you couldn’t see before.”
Jumper, Hassabis, and their team tackled a problem that stumped scientists for half a century. This AI tool ushered in a new era of studying proteins for understanding biological functions and guiding drug development. These advances in AI technology fundamentally changed the ways that scientists address problems.
“It’s an honor to receive this award in recognition of the work of our team. The work on AlphaFold has been such an incredible experience, and we’re only just beginning to see how AI will help us transform biology,” said Jumper.
See also “2022 Lasker Award Winners Announced”
References
1. Kendrew JC, et al. A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature. 1958;181:662–666.
2. Benjin X, Ling L. Developments, applications, and prospects of cryo-electron microscopy. Protein Sci. 2020;29:872–882.
3. Zemla A. LGA: A method for finding 3D similarities in protein structures. Nucleic Acids Res. 2003;31(13):3370–3374.
4. Kryshtafovych A, et al. Critical assessment of methods of protein structure prediction (CASP)-Round XIII. Proteins. 2019;87(12):1011–1020.
5. Moult J, et al. Critical assessment of techniques for protein structure prediction, fourteenth round. CASP 14 Abstract Book. 2020.
6. Senior AW, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577:706–710.