For several years now, John McGeehan, a biologist and director of the Center for Enzyme Innovation in Portsmouth, England, has been looking for a molecule that can break down 150 million tonnes of soft drink bottles and other plastic waste littered globally.
Working with researchers on both sides of the Atlantic, he has found a few good options. But his task is that of the most demanding locksmith: to pinpoint the chemical compounds that will twist and fold themselves into microscopic shapes that fit snugly into the molecules of a plastic bottle and split them apart, like a key that opens a door.
Nowadays, determining the exact chemical content of any enzyme is a fairly simple task. But determining its three-dimensional shape can require years of biochemical experimentation. So last fall, after reading that an artificial intelligence lab in London called DeepMind had built a system that automatically predicted the shape of enzymes and other proteins, Dr. McGeehan asked the lab if it could help with his project.
At the end of a workweek, he sent DeepMind a list of seven enzymes. The following Monday, the lab returned shapes for all seven. “This puts us a year ahead of where we are, if not two,” said Dr. McGeehan.
Now, any biochemist can speed up their work in a similar way. On Thursday, DeepMind published the predicted shapes of more than 350,000 proteins – the microscopic mechanisms that govern the behavior of bacteria, viruses, the human body and all other living things. This new database includes three-dimensional structures for all proteins expressed by the human genome, as well as structures for proteins found in 20 other organisms, including the mouse, the fruit fly and the E. coli bacterium.
This vast and detailed map of biology – providing some 250,000 previously unknown shapes – could accelerate our ability to understand disease, develop new drugs and repurpose existing ones. It could also lead to new types of biological tools, like an enzyme that can efficiently break down plastic bottles and convert them into materials that can be easily reused and recycled.
“This can put you ahead – influencing the way you’re thinking about problems and helping to solve them faster,” says Gira Bhabha, an assistant professor in the department of cell biology at New York University. “Whether you study neuroscience or immunology – whatever your field of biology – this can be helpful.”
This new knowledge is key in its own right: If scientists can determine the shape of a protein, they can determine how other molecules will bind to it. This could reveal how bacteria resist antibiotics – and how to counter that resistance. Bacteria resist antibiotics by expressing certain proteins; if scientists can determine the shape of these proteins, they can develop new antibiotics or new drugs to block them.
In the past, determining the exact shape of a protein required months, years, or even decades of trial-and-error experiments involving X-rays, microscopes, and other tools on the laboratory bench. But DeepMind can dramatically shorten the timeline with its AI technology, called AlphaFold.
When Dr. McGeehan sent DeepMind his list of seven enzymes, he told the lab he had determined shapes for two of them, but he didn’t say which two. This is one way to check how well the system works; AlphaFold passed the test, correctly predicting both shapes.
What was more remarkable, says Dr. McGeehan, is that the predictions arrived within days. Later, he learned that AlphaFold had in fact completed the task in just a few hours.
AlphaFold predicts protein structure using what is known as a neural network, a mathematical system that can learn tasks by analyzing large amounts of data – in this case thousands of known proteins and their physical shapes – and extrapolating to the unknown.
This is the same technology that identifies the commands you bark into your smartphone, recognizes faces in photos you post to Facebook, and translates one language into another on Google Translate and other services. But many experts believe that AlphaFold is one of the most powerful applications of the technology.
“It shows that AI can do useful things in the midst of real-world complexity,” said Jack Clark, one of the authors of the AI Index.
As Dr. McGeehan has discovered, it can be remarkably accurate. AlphaFold can predict the shape of a protein with accuracy that rivals physical experiments about 63% of the time, according to independent benchmark tests that compare its predictions with known protein structures. Most experts had assumed that a technology this powerful was still many years away.
“I thought it would take another 10 years,” said Randy Read, a professor at the University of Cambridge. “This is a complete change.”
But the system’s accuracy varies, so some predictions in DeepMind’s database will be less useful than others. Each prediction in the database comes with a “confidence score” that indicates how likely it is to be correct. DeepMind researchers estimate that the system provides a “good” prediction about 95% of the time.
As a result, the system cannot completely replace physical experiments. It is used in conjunction with lab bench work, helping scientists determine which experiments they should run and filling in the gaps when experiments fail. Using AlphaFold, researchers at the University of Colorado Boulder recently helped determine a protein structure they had been trying to identify for more than a decade.
DeepMind’s developers have chosen to share its database of protein structures freely rather than sell access, in the hope of spurring progress across the biological sciences. Demis Hassabis, chief executive and co-founder of DeepMind, which is owned by the same parent company as Google but operates more like a research lab than a commercial enterprise, said the aim was maximum impact.
Some scientists have compared DeepMind’s new database to the Human Genome Project. Completed in 2003, the Human Genome Project provided a map of all human genes. Now, DeepMind has provided a map of the roughly 20,000 proteins expressed by the human genome – another step towards understanding how our bodies work and how we can respond when something goes wrong.
The hope is also that the technology will continue to evolve. A lab at the University of Washington has built a similar system called RoseTTAFold, and like DeepMind, it has openly shared the computer code that drives its system. Anyone can use the technology, and anyone can work to improve it.
Even before DeepMind began publicly sharing its technology and data, AlphaFold was powering a wide range of projects. University of Colorado researchers are using the technology to understand how bacteria like E. coli and salmonella develop resistance to antibiotics, and to develop ways of combating that resistance. At the University of California, San Francisco, researchers have used the tool to improve their understanding of the coronavirus.
The coronavirus wreaks havoc on the body through 26 different proteins. With help from AlphaFold, the researchers have improved their understanding of an important protein, and hope the technology can help enhance their understanding of the remaining 25 proteins.
If this comes too late to affect the current pandemic, it could help in preparing for the next one. “A better understanding of these proteins will help us target not only this virus but other viruses as well,” said Kliment Verba, one of the researchers in San Francisco.
The possibilities are countless. After DeepMind gave Dr. McGeehan shapes for seven enzymes that could help rid the world of plastic waste, he sent the lab a list of 93 more. “They’re working on these right now,” he said.