DeepMind’s AI Solves an Old Grand Challenge of Biology

How DeepMind's Protein-Folding AI Is Solving One of the Oldest Challenges in Biology

Proteins are essential to life, supporting practically all of its functions. They are large, complex molecules made from chains of amino acids, and what a protein does depends largely on its unique 3D structure. Working out what shapes proteins fold into is known as the 'protein folding problem,' and it has stood as a grand challenge in biology for the past 50 years. In a significant scientific advance, the latest version of AlphaFold, the AI system built by the artificial intelligence group DeepMind, has been recognised as a solution to this grand challenge by the organisers of the biennial Critical Assessment of Protein Structure Prediction (CASP). The breakthrough demonstrates the impact AI can have on fundamental fields that explain and shape the world.

A protein's shape is closely associated with its function, and the ability to predict this structure unlocks a greater understanding of what the protein does and how it works. Many of the world's most significant challenges, such as developing treatments for diseases or finding enzymes that break down industrial waste, are fundamentally tied to proteins and the roles they play.

Determining protein structures has been a focus of intensive scientific research for many years, using experimental techniques such as nuclear magnetic resonance and X-ray crystallography. These methods, along with newer techniques like cryo-electron microscopy, depend on extensive trial and error that can take years of painstaking work per structure and require multi-million dollar specialised equipment.

"It's a very substantial advance," says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software for predicting protein structure. "It's something I didn't expect to happen this rapidly. It's shocking, in a way."

"This is a big deal," says David Baker, head of the Institute for Protein Design at the University of Washington and leader of the team behind Rosetta, a family of protein analysis tools. "This is an incredible achievement, like what they did with Go."

Astronomical Numbers

Determining a protein's structure is very hard. For most proteins, researchers know the sequence of amino acids in the ribbon but not the contorted shape it folds into. And there is usually an astronomical number of possible forms for each sequence. Researchers have been wrestling with the problem since the 1970s, when Christian Anfinsen won the Nobel Prize for showing that sequence determines structure.
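To give a feel for what "astronomical" means here, the following is a rough back-of-the-envelope sketch in Python, not a description of how AlphaFold or any CASP entry works. It assumes a toy model in which each amino acid independently adopts one of a small number of backbone states. The figure of 3 states per residue, the 100-residue chain length, and the sampling rate are illustrative assumptions, not values from the article.

```python
def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Count conformations in a toy model where every residue
    independently takes one of `states_per_residue` states (assumed)."""
    return states_per_residue ** n_residues


def exhaustive_search_years(n_conformations: int,
                            samples_per_second: float = 1e13) -> float:
    """Years needed to try every conformation once at an assumed,
    purely illustrative sampling rate."""
    seconds = n_conformations / samples_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    n = conformation_count(100)  # hypothetical 100-residue protein
    print(f"{n:.2e} possible conformations")
    print(f"{exhaustive_search_years(n):.2e} years to enumerate them all")
```

Even under these modest assumptions the count is on the order of 10^47, which is why trying every shape in turn is not an option and why prediction methods have to find shortcuts.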

The launch of CASP in 1994 boosted the field. Every two years, the organisers release 100 or so amino acid sequences for proteins whose shapes have been identified in the lab but not yet made public. Teams from around the world then compete to find the correct way to fold them up using software. Medical researchers already use many of the tools developed for CASP. But progress was slow, with two decades of incremental advances failing to produce a shortcut to detailed lab work.

CASP got the jolt it was searching for when DeepMind entered the competition in 2018 with its first version of AlphaFold. It still could not match a lab's accuracy, but it left other computational techniques in the dust. Many researchers took note and soon adapted their systems to work more like AlphaFold.

In 2020, over half of the entries used some form of deep learning, says John Moult, a co-founder of CASP. As a result, accuracy was higher across the board. Baker's new system, called trRosetta, uses some of DeepMind's ideas from 2018, though it still came a "very distant second," he adds.

DeepMind says it plans to study leishmaniasis, sleeping sickness, and malaria, all tropical diseases caused by parasites linked to many unknown protein structures.

One drawback of AlphaFold is that it is slow compared with rival techniques. AlQuraishi's system uses an algorithm called a recurrent geometric network (RGN). It can find protein structures a million times faster, returning results in seconds rather than days. Although its predictions are less accurate, speed is more critical for some applications, he says.

Researchers are now trying to discover exactly how AlphaFold works. "Once they describe to the world how they do it, a thousand flowers will bloom," Baker says. "People will be using it for all types of different things that we can't imagine now."

Analytics Insight
www.analyticsinsight.net