Deep Learning-Driven Insights into Protein-Protein Interactions

Researchers demonstrate a real-world, large-scale application of deep neural network models for discovering novel protein-protein interactions.

The Science

Protein sequencing allows scientists to identify the amino acids in a protein. The sequence of these amino acids determines the protein’s shape and function. DeepMind’s AlphaFold 2 is an artificial intelligence system originally designed to predict the shape of a single protein from its sequence. In this research, scientists built on AlphaFold 2 to develop a deep learning approach for predicting and modeling multi-protein interactions. The approach, AF2Complex, generates much more accurate structural models of protein complexes than previous methods. AF2Complex can even predict novel protein-protein interactions. As a proof of concept, the researchers used AF2Complex to virtually screen the proteins in the pathways that create outer membranes in E. coli. This led to the discovery of unexpected protein-protein interactions.

The Impact

Protein-protein interactions are essential for life. AF2Complex provides a powerful computational approach for detecting and modeling these interactions. The approach can be applied to the entire complement of proteins in a cell. As a large-scale proof of concept, researchers used AF2Complex to examine an essential E. coli pathway. The work showcases the promise of a deep learning-based research strategy for studying biological systems. It could help researchers understand many other biological systems by discovering novel protein-protein interactions and offering high-quality predictions of their complex structures.


Life depends on molecular machines made of proteins that interact with each other to form functional complexes. Researchers need accurate descriptions of protein-protein interactions to understand molecular biosystems, but obtaining such descriptions is very challenging, especially for theoretical approaches. Until now, protein-protein interactions have mainly been discovered and characterized by experimental approaches. By generalizing AlphaFold 2, a powerful deep learning algorithm for predicting protein structures from sequence, researchers at Georgia Institute of Technology and Oak Ridge National Laboratory proposed a computational approach, AF2Complex. It predicts not only atomic structural models of interacting proteins but also whether multiple proteins interact at all, even when the interactions are transient and difficult to capture experimentally.

Scientists know that such hard-to-capture interactions occur in a bacterial pathway that assists the translocation and folding of outer membrane proteins. Using AF2Complex, the team virtually screened several key proteins of this pathway for interactions against about 1,500 proteins within the cell envelope of E. coli. The study used the Summit supercomputer at Oak Ridge National Laboratory. Among the highest-confidence hits, the researchers identified both known interacting partners and unexpected partners with implications for the outer membrane biogenesis pathway. The resulting structural models of supercomplexes reveal multiple conformations that explain previous experimental observations and inspire new mechanistic hypotheses for understanding how outer membrane proteins are made.
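
The general shape of such a virtual screen can be pictured as a ranking loop: each key protein (a "bait") is paired with every candidate, each pair receives a confidence-like score, and the top-scoring candidates are kept for closer inspection. The Python sketch below illustrates only that loop. It is not the AF2Complex code or its interface: the function names, the protein identifiers, and the scores are all hypothetical placeholders, and in a real screen the scorer would wrap a structure-prediction run rather than a lookup table.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical type: a scorer takes two protein identifiers and returns a
# confidence-like number, where higher means "more likely to interact".
Scorer = Callable[[str, str], float]


def screen(baits: Iterable[str],
           candidates: Iterable[str],
           scorer: Scorer,
           top_n: int = 5) -> Dict[str, List[Tuple[str, float]]]:
    """Score every bait/candidate pair and keep the top_n hits per bait."""
    candidate_list = list(candidates)
    hits: Dict[str, List[Tuple[str, float]]] = {}
    for bait in baits:
        # Skip pairing a bait with itself in this simplified sketch.
        scored = [(c, scorer(bait, c)) for c in candidate_list if c != bait]
        # Sort by score, highest first; ties keep their input order.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        hits[bait] = scored[:top_n]
    return hits


# Made-up scores for illustration only; these are not results from the study.
TOY_SCORES = {
    ("bait_A", "prot_1"): 0.2,
    ("bait_A", "prot_2"): 0.9,
}


def toy_scorer(bait: str, candidate: str) -> float:
    """Placeholder scorer: look up a toy score, defaulting to 0.0."""
    return TOY_SCORES.get((bait, candidate), 0.0)


if __name__ == "__main__":
    ranked = screen(["bait_A"], ["prot_1", "prot_2", "prot_3"], toy_scorer, top_n=2)
    for bait, partners in ranked.items():
        print(bait, partners)
```

Because every pair is scored independently, a screen of this shape parallelizes naturally across many compute nodes, which is one reason a large machine is a good fit for screening a few baits against roughly 1,500 candidates.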

Principal Investigator

Mu Gao
Georgia Institute of Technology

BER Program Manager

Ramana Madupu

U.S. Department of Energy, Biological and Environmental Research (SC-33)
Biological Systems Science Division


This work was funded in part by the Department of Energy Office of Science, Office of Biological and Environmental Research and by the National Institutes of Health Division of General Medical Sciences. The research used resources supported in part by the Director’s Discretion Project at the Oak Ridge Leadership Computing Facility and the Advanced Scientific Computing Research Leadership Computing Challenge program. The researchers are grateful for access to the computing resources provided by the Leadership Computing Facility at Oak Ridge, the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory, and the Partnership for an Advanced Computing Environment at the Georgia Institute of Technology.


Gao, M., et al. AF2Complex predicts direct physical interactions in multimeric proteins with deep learning. Nature Communications 13, 1744 (2022). [DOI:10.1038/s41467-022-29394-2]

Gao, M., Nakajima An, D., & Skolnick, J. Deep learning-driven insights into super protein complexes for outer membrane protein biogenesis in bacteria. eLife 11, e82885 (2022). [DOI:10.7554/eLife.82885]