latentbrief

Protein Structure Prediction

The computational challenge of determining the three-dimensional shape a protein will fold into from its amino acid sequence alone - a problem that AI systems can now solve at near-experimental accuracy for many proteins.

Added May 21, 2026 · 2 min read

Proteins are chains of amino acids that fold into precise three-dimensional shapes. That shape determines their function: how they bind to other molecules, catalyse reactions, or serve as structural components of cells. For decades, determining a protein's structure required painstaking experimental work using X-ray crystallography or cryo-electron microscopy - techniques that can take months and significant resources per protein.

Protein structure prediction - determining the 3D shape from the amino acid sequence alone - was long considered one of the hardest problems in computational biology. In 2020, DeepMind's AlphaFold2 achieved near-experimental accuracy for many targets in the CASP14 assessment, a result widely regarded as a breakthrough. The system uses an attention-based, transformer-style architecture that attends to the full sequence and to a multiple sequence alignment (related sequences from other organisms) to predict the 3D coordinates of the protein's atoms.
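One way to make "positions of atoms relative to each other" concrete is a pairwise distance matrix. The sketch below is only an illustration, not AlphaFold2's internal representation: the coordinates are invented, the helper names are hypothetical, and the 8 angstrom cutoff is a commonly used convention for calling two residues "in contact".

```python
import math

# Hypothetical C-alpha coordinates (in angstroms) for an 8-residue toy chain
# shaped like a hairpin. These numbers are invented for illustration.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (11.4, 3.8, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
]


def distance_matrix(points):
    """Return the symmetric matrix of Euclidean distances between points."""
    return [[math.dist(p, q) for q in points] for p in points]


def long_range_contacts(points, cutoff=8.0, min_separation=4):
    """List residue pairs (i, j) that are at least `min_separation` positions
    apart along the chain but within `cutoff` angstroms of each other in space."""
    d = distance_matrix(points)
    n = len(points)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if d[i][j] <= cutoff
    ]


contacts = long_range_contacts(coords)
```

In this toy hairpin the first and last residues sit seven positions apart in the chain yet only 3.8 angstroms apart in space, so the pair (0, 7) appears in `contacts`.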

A key ingredient in AlphaFold2 is evolutionary information: amino acids that are far apart in the sequence but close in space tend to co-evolve - a mutation in one is often compensated by a correlated mutation in the other. These co-evolutionary signals appear as correlated columns in the multiple sequence alignment and inform the spatial predictions.
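To give a feel for what a co-evolutionary signal looks like, the sketch below scores pairs of alignment columns with mutual information. This is a deliberately simple classical statistic swapped in for illustration; AlphaFold2 does not compute it explicitly but learns from the alignment through attention. The alignment and the function names are invented.

```python
import math
from collections import Counter

# Toy multiple sequence alignment: each row is a hypothetical related
# sequence, each column an aligned position. Invented for illustration.
msa = [
    "AKLDE",
    "AKLDE",
    "ARLEE",
    "ARLEE",
    "AKIDE",
    "ARIEE",
]


def column(alignment, i):
    """Return the residues found at aligned position i."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    counts_i = Counter(column(alignment, i))
    counts_j = Counter(column(alignment, j))
    counts_ij = Counter(zip(column(alignment, i), column(alignment, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


covarying = mutual_information(msa, 1, 3)  # K pairs with D, R pairs with E
unrelated = mutual_information(msa, 1, 2)  # columns vary independently
```

Columns 1 and 3 always change together in this toy alignment, so they score higher than columns 1 and 2, which vary independently. On real alignments, raw mutual information is noisy, which is one reason learned models are used in practice.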

The practical consequences have been significant. DeepMind and EMBL-EBI released the AlphaFold Protein Structure Database, which covers nearly all proteins in UniProt - over 200 million predicted structures. Scientists can now look up a structural prediction, accompanied by confidence estimates, for proteins that might have taken years to characterise experimentally.
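Predicted structures are not equally trustworthy everywhere along the chain. AlphaFold model files in PDB format store a per-residue confidence score (pLDDT, on a 0-100 scale) in the B-factor column. The sketch below reads that column from a tiny hand-made file; the atom records and both helper functions are hypothetical, and the threshold of 70 is a commonly cited rule of thumb for "confident" regions rather than a hard limit.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z, b_factor):
    """Build a fixed-width PDB ATOM record (toy helper, hypothetical values)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


def per_residue_plddt(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor field
    (columns 61-66) of each C-alpha ATOM record."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            res_seq = int(line[22:26])
            scores[res_seq] = float(line[60:66])
    return scores


# Three invented C-alpha records standing in for a downloaded model file.
toy_pdb = "\n".join([
    format_atom_line(1, " CA", "MET", "A", 1, 0.0, 0.0, 0.0, 45.20),
    format_atom_line(2, " CA", "LYS", "A", 2, 3.8, 0.0, 0.0, 72.50),
    format_atom_line(3, " CA", "LEU", "A", 3, 7.6, 0.0, 0.0, 93.10),
])

scores = per_residue_plddt(toy_pdb)
confident = [res for res, score in scores.items() if score >= 70.0]
```

Here residues 2 and 3 pass the threshold while residue 1 does not. Low-scoring stretches like that are best treated with caution before drawing conclusions from the predicted shape.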

AlphaFold3 extended the approach to predict the structures of complexes that combine proteins with DNA, RNA, and small molecules - a step from predicting how a single protein folds toward modelling how it interacts with its partners.

Analogy

Protein structure prediction is like predicting the shape of a paper crane from a folding instruction sheet, without actually folding it. The amino acid sequence is the instruction sheet; the 3D structure is the resulting shape. The challenge is that evolution has compressed millions of years of folding experiments into the sequence, and reading that information requires recognising patterns across all related sequences.

Real-world example

AlphaFold2 was applied to the malaria parasite proteome, predicting structures for proteins that had never been characterised experimentally. This enabled researchers to identify potential drug targets for malaria treatment - proteins with structural features that could be inhibited by small molecules - without waiting years for experimental structure determination.

Why it matters

Protein structure prediction is foundational to drug discovery, vaccine development, and understanding disease mechanisms. AI making this fast and cheap has accelerated biology research across many areas - from basic science to industrial enzyme design - by removing a major bottleneck.
