Department of Biochemistry, University of Washington, Seattle, WA, USA.
Institute for Protein Design, University of Washington, Seattle, WA, USA.
Nature. 2023 Aug;620(7976):1089-1100. doi: 10.1038/s41586-023-06415-8. Epub 2023 Jul 11.
There has been considerable recent progress in designing new proteins using deep-learning methods. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.
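The "structure denoising tasks" mentioned above can be pictured with a toy forward-noising step of the kind used in standard diffusion models: a clean set of coordinates is blended with Gaussian noise, and a network is then trained to recover the clean structure. The sketch below is illustrative only and is not the authors' implementation. The cosine schedule, the coordinate-only noise, the function names `cosine_alpha_bar` and `add_noise`, and the four-point toy backbone are all assumptions; the abstract does not specify any of them.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Fraction of signal variance kept at step t (cosine schedule; an assumed choice)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def add_noise(coords, t, num_steps, rng):
    """Return a noised copy of coords: sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, I)."""
    a = cosine_alpha_bar(t, num_steps)
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [
        tuple(signal * c + noise * rng.gauss(0.0, 1.0) for c in atom)
        for atom in coords
    ]


# Hypothetical toy "backbone": four points spaced 3.8 units apart along x.
backbone = [(3.8 * i, 0.0, 0.0) for i in range(4)]
rng = random.Random(0)

# Early step: coordinates stay close to the clean structure.
slightly_noised = add_noise(backbone, t=1, num_steps=100, rng=rng)
# Final step: coordinates are essentially pure Gaussian noise.
heavily_noised = add_noise(backbone, t=100, num_steps=100, rng=rng)
```

A denoising model would be trained on pairs such as `(heavily_noised, backbone)`, and generation would run the learned denoiser step by step starting from noise.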