Breaking Through Computational Barriers to Create Designer Proteins

The Science

Designing proteins is a massive combinatorial problem. Scientists must consider how the building blocks of proteins, amino acids, interact with each other in ways that drive their spatial position and orientation, resulting in three-dimensional protein structures. Then, they use a protein design algorithm to find proteins that pair perfectly with each other. This is particularly difficult when searching a database of thousands to millions of candidates for combinations of two different proteins that bind exclusively to one another. These protein pairs must have backbone shapes that complement only each other. Using advanced computational methods to find working designs, researchers created six protein pairs of this type in cells.
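
To make "exclusively bind to one another" concrete, the following is a minimal sketch in Python, not the researchers' method. It assumes a hypothetical list of observed interactions between hypothetical protein names, and reports only those pairs in which each partner binds nothing else, including itself.

```python
def exclusive_pairs(proteins, interactions):
    """Return pairs (a, b) in which a binds only b and b binds only a."""
    # Collect every observed binding partner for each protein.
    partners = {p: set() for p in proteins}
    for a, b in interactions:
        partners[a].add(b)
        partners[b].add(a)
    # A pair is exclusive when each protein's only partner is the other.
    return sorted(
        (a, b)
        for a in proteins
        for b in proteins
        if a < b and partners[a] == {b} and partners[b] == {a}
    )


# Hypothetical data: A3 also binds itself, so A3/B3 is not exclusive.
proteins = ["A1", "B1", "A2", "B2", "A3", "B3"]
interactions = [("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A3", "A3")]
pairs = exclusive_pairs(proteins, interactions)
```

In this toy data set, a single unwanted self-interaction is enough to disqualify a pair, which illustrates why exclusivity becomes harder to satisfy as the set of candidate proteins grows.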

The Impact

If scientists could engineer pairs of proteins that bind only to one another, they could have much more control over cells in living systems. This ability could enable bioengineering applications with large impacts for medicine and biomaterials. Currently, scientists can design only DNA (not proteins themselves) to form these interactions. The ability to program DNA interactions gave rise to technologies such as DNA origami and artificial circuits. A general method for creating protein pairs would also be very powerful, opening the door to many more possibilities.

Summary

Using advanced computing, scientists designed protein pairs that perfectly complement each other. This work used the Rosetta software, which has a long history in protein modeling, analysis, and design. Past helical bundle design work had focused on single-molecule bundles or on homooligomers (assemblies of many copies of the same molecule). When two different proteins must pair, the coiled-coil parameter space is vast. Running Rosetta on the Mira supercomputer at Argonne National Laboratory, the team sampled conformations efficiently through a massively parallelized grid search of 11 parameters, finding 87 million (27 million untwisted and 60 million left-handed supercoiled) unique working designs for four-helix backbones (35 residues each). The team then searched exhaustively for unique hydrogen-bond networks that connected all four helices, finding 2,251 unique networks. Low-energy sequences were then identified using the RosettaDesign server to test compatible placements of the hydrogen-bond networks within all four-helix candidates. Of the 97 computationally selected designs that were stable and satisfied additional criteria, 94 were well expressed in Escherichia coli, 85 had the expected size as measured with size-exclusion chromatography, 65 formed constitutive heterodimers, and 39 were exclusive heterodimers. X-ray crystal structures of four selected designs agreed well with the computational models, confirming the hydrogen-bond networks designed into the structures. The team also rearranged the hydrogen-bond networks among different helical repeat units to expand the heterodimer set. This rearrangement was largely successful, generating 22 new constitutive heterodimers. In the end, the team created six fully orthogonal protein heterodimer pairs in E. coli cells.
This work provides a path forward for computationally designing specific, programmable binding into proteins, previously a property found only in the DNA and RNA world.
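
The idea behind a parallelized grid search can be sketched in a few lines of Python. This is an illustration only, not the Rosetta protocol: the three parameter names, their values, and the acceptance rule are all hypothetical stand-ins for the 11 parameters and backbone-quality checks used in the study. Each worker ("rank") takes every n-th point of the grid, so the ranks cover the grid exactly once and never need to communicate.

```python
from itertools import islice, product

# Hypothetical toy grid: three illustrative parameters, not the study's 11.
GRID = {
    "radius": [5.0, 6.0, 7.0, 8.0],
    "phase_deg": [0, 90, 180, 270],
    "z_offset": [-1.5, 0.0, 1.5],
}


def points_for_rank(grid, rank, n_ranks):
    """Yield the grid points assigned to one worker by round-robin striding."""
    names = list(grid)
    all_points = product(*(grid[name] for name in names))
    for values in islice(all_points, rank, None, n_ranks):
        yield dict(zip(names, values))


def is_acceptable(point):
    """Placeholder filter standing in for a real backbone-quality check."""
    return 6.0 <= point["radius"] <= 7.0 and point["phase_deg"] in (0, 180)


def search(grid, n_ranks):
    """Process each rank's share in turn and merge the accepted points."""
    accepted = []
    for rank in range(n_ranks):
        share = points_for_rank(grid, rank, n_ranks)
        accepted.extend(p for p in share if is_acceptable(p))
    return accepted


hits = search(GRID, n_ranks=4)
```

Here the ranks are processed one after another for simplicity. On a real cluster, each rank would run as its own process, which works because every grid point can be evaluated independently of the others.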

Principal Investigator

David Baker
University of Washington

BER Program Manager

Amy Swain

U.S. Department of Energy, Biological and Environmental Research (SC-33)
Biological Systems Science Division

Funding

The project received funding from the Office of Biological and Environmental Research, within the U.S. Department of Energy (DOE) Office of Science, and the National Institutes of Health (NIH) at the Advanced Light Source, a DOE Office of Science user facility at Lawrence Berkeley National Laboratory. The project used the Argonne Leadership Computing Facility, another DOE Office of Science user facility, at Argonne National Laboratory to run the computations.

Funding was also received from the Howard Hughes Medical Institute, Schmidt Futures Program, European Research Area Network (ERA-NET) BioOrigami Consortium, National Science Foundation, Burroughs Wellcome Fund Career Award at the Scientific Interface, German Research Foundation, Raymond and Beverly Sackler Fellowship, Institute for Protein Design, and Washington Research Foundation.

References

Chen, Z., S. E. Boyken, M. Jia, F. Busch, D. Flores-Solis, M. J. Bick, P. Lu, Z. L. VanAernum, A. Sahasrabuddhe, R. A. Langan, S. Bermeo, T. J. Brunette, V. Khipple Mulligan, L. P. Carter, F. DiMaio, N. G. Sgourakis, V. H. Wysocki and D. Baker. 2019. “Programmable Design of Orthogonal Protein Heterodimers,” Nature 565, 106–11. DOI:10.1038/s41586-018-0802-y.