Tags: AlphaFold, AI, Protein Structure

The AlphaFold Revolution: How AI Solved Protein Structure Prediction

How DeepMind's AlphaFold transformed structural biology, what it means for drug discovery, and how you can use it in your own research.

Sudipta Sardar · March 5, 2025 · 10 min read

In late 2020, DeepMind's AlphaFold2 achieved something that had eluded scientists for half a century: predicting protein three-dimensional structures from amino acid sequences alone with near-experimental accuracy. At CASP14, the Critical Assessment of protein Structure Prediction competition, AlphaFold2 achieved a median GDT_TS (Global Distance Test) score of 92.4 out of 100 across all targets, a level generally considered competitive with experimental methods.

The Protein Folding Problem

Proteins are molecular machines made of amino acid chains that fold into specific 3D shapes, and that shape determines function. Misfolded proteins are implicated in diseases such as Alzheimer's and Parkinson's. Understanding the rules that govern folding has been one of biology's grand challenges ever since Christian Anfinsen's work, recognized with the 1972 Nobel Prize in Chemistry, showed that a protein's amino acid sequence determines its folded structure.

Experimental methods like X-ray crystallography and cryo-electron microscopy can determine structures, but they are expensive, time-consuming, and do not work for all proteins. Before AlphaFold, computational predictions were often too inaccurate to be scientifically useful.

How AlphaFold Works

At the core of AlphaFold2 is a neural network block called the Evoformer, which uses attention to jointly process two inputs: a multiple sequence alignment (MSA) of related sequences and a set of pairwise residue features. The key insight is that co-evolutionary information encodes structural constraints. When two positions mutate in a correlated way across homologous sequences, they are often in contact in the folded protein.
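To make the idea of correlated mutations concrete, here is a minimal sketch that scores how strongly two MSA columns covary using mutual information. This is a classical covariation statistic, not what AlphaFold2 computes internally (its network learns from the alignment directly), and the function name and toy alignment are made up for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences. A high value means
    the residues at the two positions tend to change together.
    """
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (hypothetical): columns 0 and 2 always change together,
# column 1 is fully conserved, column 3 varies independently of column 0.
toy_msa = ["AKDL", "AKDV", "RKEL", "RKEV"]

covarying = mutual_information(toy_msa, 0, 2)
conserved = mutual_information(toy_msa, 0, 1)
independent = mutual_information(toy_msa, 0, 3)
```

In this toy case only the covarying pair carries signal; a conserved column and an independently varying column both score zero. Real alignments need thousands of sequences and corrections for phylogenetic bias before such scores become useful.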

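The model was trained on approximately 170,000 experimentally determined structures from the Protein Data Bank (PDB). A structure module converts the network's internal representations directly into 3D coordinates, and the prediction is refined iteratively by feeding it back through the network several times, a process called recycling. Each residue also receives a confidence score called pLDDT (predicted Local Distance Difference Test), on a scale from 0 to 100.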

Impact on Research

The AlphaFold Protein Structure Database now contains predicted structures for over 200 million proteins, covering nearly every catalogued protein sequence. This has accelerated research in drug discovery, enzyme engineering, and evolutionary biology. Researchers can now look up a structural hypothesis in seconds rather than spending months or years on experimental determination.

In drug discovery, predicted structures serve as starting points for virtual screening campaigns. In synthetic biology, engineers use AlphaFold to help design and check novel proteins with desired functions. The latest version, AlphaFold3, released in 2024, extends predictions to complexes of proteins with DNA, RNA, and small molecules.

Using AlphaFold in Your Research

You can access pre-computed predictions through the AlphaFold Protein Structure Database (alphafold.ebi.ac.uk). For custom predictions, ColabFold provides a Google Colab notebook that runs AlphaFold2 using MMseqs2 for faster MSA generation. LocalColabFold allows you to run predictions on your own hardware.
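For scripted access to pre-computed predictions, the sketch below looks up a UniProt accession and downloads the model file using only the Python standard library. It assumes the database exposes a JSON endpoint at /api/prediction/<accession> whose entries carry a pdbUrl field; both should be checked against the current API documentation before relying on them. All function names here are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint root; verify against the database's API documentation.
API_ROOT = "https://alphafold.ebi.ac.uk/api/prediction"


def prediction_url(accession):
    """Build the (assumed) metadata URL for a UniProt accession."""
    acc = accession.strip().upper()
    if not acc.isalnum():
        raise ValueError(f"not a plausible UniProt accession: {accession!r}")
    return f"{API_ROOT}/{acc}"


def pick_pdb_url(entries):
    """Return the first 'pdbUrl' found in the parsed JSON entries, or None.

    The 'pdbUrl' field name is an assumption about the response format.
    """
    for entry in entries:
        url = entry.get("pdbUrl")
        if url:
            return url
    return None


def fetch_pdb_text(accession, timeout=30):
    """Download the predicted structure for `accession` as PDB-format text."""
    with urllib.request.urlopen(prediction_url(accession), timeout=timeout) as resp:
        entries = json.load(resp)
    pdb_url = pick_pdb_url(entries)
    if pdb_url is None:
        raise LookupError(f"no PDB file listed for {accession}")
    with urllib.request.urlopen(pdb_url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling fetch_pdb_text("P69905"), for example, would request the prediction for human hemoglobin subunit alpha. The call needs network access and fails if the endpoint or field name differs from what is assumed here.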

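Always check the pLDDT confidence scores, which range from 0 to 100. Regions with pLDDT above 90 are highly reliable. Regions between 70 and 90 are generally good, with the backbone usually placed correctly. Regions between 50 and 70 are low confidence and should be treated with caution. Below 50, the prediction is likely unreliable, often indicating intrinsically disordered regions.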
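AlphaFold model files in PDB format store each residue's pLDDT in the B-factor column, so the check can be automated. The following is a minimal sketch that reads the score from the C-alpha atom of each residue and sorts residues into four bands (above 90, 70 to 90, 50 to 70, below 50), labeled with the names used in the database's colour legend. The function names are hypothetical, and the parser assumes standard fixed-column ATOM records.

```python
from collections import Counter


def plddt_per_residue(pdb_text):
    """Map (chain, residue_number) -> pLDDT using C-alpha ATOM records.

    Relies on PDB fixed columns: atom name in 13-16, chain in 22,
    residue number in 23-26, B-factor (here pLDDT) in 61-66.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            res_num = int(line[22:26])
            scores[(chain, res_num)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Classify a single pLDDT value into a confidence band."""
    if plddt > 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    if plddt >= 50:
        return "low"
    return "very low"


def band_fractions(scores):
    """Fraction of residues in each band that occurs; empty dict if no residues."""
    total = len(scores)
    if total == 0:
        return {}
    counts = Counter(confidence_band(s) for s in scores.values())
    return {band: n / total for band, n in counts.items()}
```

A model where most residues fall in the "very low" band is a hint that the protein, or that region of it, may be disordered rather than that the prediction simply failed.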

