How AlphaFold Multimer Predicts Protein Complexes

Proteins are the molecular machinery of life, performing nearly every task within a cell, from catalyzing metabolic reactions to replicating DNA. They begin as a linear chain of amino acids, which must fold into a precise, three-dimensional shape to function correctly.

While a single protein chain is known as a monomer, many cellular functions are carried out not by individual proteins but by assemblies of two or more interacting protein chains, known as protein complexes or multimers. Determining the exact atomic structure of these complexes has long been a formidable undertaking. Traditional experimental methods, such as X-ray crystallography, often cannot resolve the structures of these large, fragile assemblies.

AlphaFold Multimer (AFM), developed by DeepMind as an extension of AlphaFold 2, is an artificial intelligence system specifically designed to predict the 3D structure of these intricate protein complexes directly from their amino acid sequences. This computational approach offers a rapid complement to experimental structure determination.

The Challenge of Protein Complexes

Protein complexes are responsible for coordinating processes such as cell signaling, moving materials across cell membranes, and mounting an immune response. Faulty interactions within these complexes contribute to many diseases, including cancers and neurodegenerative disorders.

Historically, structural biologists relied on techniques like X-ray crystallography, which requires proteins to be coaxed into forming a highly ordered crystal lattice. This process is time-consuming and costly, and it often fails entirely for large, flexible, or membrane-bound complexes. Because these experimental methods capture only a static snapshot and struggle with the complexity of multi-chain assemblies, the structures of thousands of important complexes remained unknown.

How AlphaFold Multimer Predicts Structures

AlphaFold Multimer employs a deep neural network trained on experimentally determined protein structures. The system takes the amino acid sequences of the constituent protein chains as input, along with information derived from multiple sequence alignments (MSAs).

This alignment step compares the input sequences to thousands of related proteins across different species to identify pairs of amino acids that have evolved together, which suggests they are physically close in the final 3D structure. From these signals, the network builds an internal representation of every pair of amino acids that encodes their likely spatial relationship, such as the distance between them. A structure module then places each amino acid in 3D space by predicting its position and orientation.
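The co-evolution signal described above can be illustrated with a classical statistic, the mutual information between two alignment columns. This is only a minimal sketch under stated assumptions: the toy alignment and the function names are hypothetical, and AlphaFold Multimer does not compute this score explicitly. Its network learns such relationships from the MSA directly.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean the residues at the two positions tend to change together.
    """
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 1 and 3 always change together
# (K pairs with E, R pairs with D), while column 0 never changes.
toy_msa = ["AKLE", "AKLE", "ARLD", "ARLD", "AKIE", "ARID"]

covarying = mutual_information(toy_msa, 1, 3)
conserved = mutual_information(toy_msa, 0, 1)
independent = mutual_information(toy_msa, 1, 2)
```

In this toy case the covarying pair scores well above the conserved and independent pairs, which is the kind of signal that hints at physical contact between two positions.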

For a protein complex, the network must perform this prediction not only for amino acids within the same protein chain but also for those across different chains that form the interface of the complex. By iteratively refining these pairwise relationships and the resulting coordinates over several passes, the system assembles the input sequences into a three-dimensional model of the entire multimer. This end-to-end approach, which natively handles the interactions between multiple chains, allows AlphaFold Multimer to outperform earlier approaches that adapted single-chain predictors to complexes.
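The distinction between within-chain and cross-chain residue pairs can be sketched with a small bookkeeping example. It assumes the chains are simply concatenated into one residue list; the helper names are hypothetical and are not taken from the AlphaFold Multimer code.

```python
def chain_ids(chain_lengths):
    """Label every residue with the index of the chain it belongs to."""
    ids = []
    for chain_index, length in enumerate(chain_lengths):
        ids.extend([chain_index] * length)
    return ids


def inter_chain_mask(chain_lengths):
    """Square boolean matrix: True where residues i and j sit on different chains."""
    ids = chain_ids(chain_lengths)
    size = len(ids)
    return [[ids[i] != ids[j] for j in range(size)] for i in range(size)]


# Hypothetical complex: chain A has 2 residues, chain B has 3.
mask = inter_chain_mask([2, 3])

# Count the ordered residue pairs that cross the chain boundary.
cross_chain_pairs = sum(cell for row in mask for cell in row)
```

Only the cross-chain entries of such a matrix can describe the interface, which is why a predictor built for single chains has no natural place to put that information.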

Applications in Drug Discovery and Disease Modeling

The capacity to quickly predict protein complex structures is reshaping drug discovery and disease modeling. One of the most immediate impacts is on rational drug design, where scientists can now visualize the predicted contours of a multimer’s surface. This structural detail facilitates the identification of pockets or binding sites where a small molecule drug can be designed to fit closely, either to inhibit or enhance the complex’s function.

The technology is also proving valuable for understanding pathogen interactions, particularly in the context of infectious diseases. For example, AlphaFold Multimer can model how a viral protein, such as the SARS-CoV-2 spike protein, interacts with a human cell receptor like ACE2 to form a functional complex. By predicting the atomic interface between the pathogen and the host, researchers can gain insights into the mechanism of infection and accelerate the development of therapeutics that block this binding event.
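Once a model of a complex is available, the interface can be listed by finding residues on one chain that lie close to residues on the other. The sketch below assumes one representative 3D point per residue and an 8 Å cutoff, a common but not universal convention. The coordinates are made up for illustration and do not come from any real spike or ACE2 model.

```python
from math import dist


def interface_pairs(chain_a, chain_b, cutoff=8.0):
    """Return (i, j) index pairs where residue i of chain_a lies within
    `cutoff` angstroms of residue j of chain_b."""
    pairs = []
    for i, point_a in enumerate(chain_a):
        for j, point_b in enumerate(chain_b):
            if dist(point_a, point_b) <= cutoff:
                pairs.append((i, j))
    return pairs


# Hypothetical coordinates in angstroms, one point per residue.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(7.6, 6.0, 0.0), (7.6, 9.8, 0.0), (7.6, 13.6, 0.0)]

contacts = interface_pairs(chain_a, chain_b)
```

Here only the last two residues of the first chain come within the cutoff of the first residue of the second chain. In a real analysis, the residues flagged this way would be the natural candidates for mutagenesis experiments or for designing molecules that block binding.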

Furthermore, the system accelerates basic research into genetic diseases that arise from faulty protein interactions. Researchers can model the effect of a genetic mutation on the structure of a complex, providing a molecular explanation for the disease and suggesting new avenues for therapeutic intervention.
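Modeling a mutation typically starts by editing the input sequence and predicting the structure again. The following is a minimal sketch of that preparation step, assuming mutations are written in the common "K2R" style (original residue, 1-based position, new residue). The function name and the example sequence are hypothetical.

```python
import re


def apply_point_mutation(sequence, mutation):
    """Return a copy of `sequence` with a single substitution such as 'K2R' applied."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # mutation positions are 1-based
    if not 0 <= index < len(sequence):
        raise ValueError(f"Position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Hypothetical toy sequence, not a real protein.
wild_type_sequence = "MKTAYIAK"
mutant_sequence = apply_point_mutation(wild_type_sequence, "K2R")
```

The wild-type and mutant sequences can then each be submitted for prediction and the resulting models compared. Checking the original residue before substituting guards against off-by-one numbering mistakes.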

Democratizing Science Through Accessibility

DeepMind released the underlying code as open source and made its structural predictions freely available. This move lowers the barrier to entry for structural biology research across the globe.

Researchers in laboratories without access to expensive, specialized equipment, such as powerful electron microscopes or X-ray synchrotron facilities, can now utilize structural data. This open accessibility has fostered greater scientific equity, enabling small research groups and scientists in developing nations to participate in cutting-edge structural biology projects.

DeepMind partnered with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) to create the AlphaFold Protein Structure Database, which houses millions of predicted protein structures and makes them freely available. Even so, the computational models are best treated as highly informed hypotheses that must ultimately be tested and validated by experimental evidence before they are relied on for drug development and biological discovery.