De novo protein design with a denoising diffusion network independent of pretrained structure prediction models.

Author information

Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, Hefei National Research Center for Physical Sciences at the Microscale, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, University of Science and Technology of China, Hefei, China.

MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China.

Publication information

Nat Methods. 2024 Nov;21(11):2107-2116. doi: 10.1038/s41592-024-02437-w. Epub 2024 Oct 9.

Abstract

The recent success of RFdiffusion, a method for protein structure design with a denoising diffusion probabilistic model, has relied on fine-tuning the RoseTTAFold structure prediction network for protein backbone denoising. Here, we introduce SCUBA-diffusion (SCUBA-D), a protein backbone denoising diffusion probabilistic model freshly trained by considering co-diffusion of sequence representation to enhance model regularization and adversarial losses to minimize data-out-of-distribution errors. While matching the performance of the pretrained RoseTTAFold-based RFdiffusion in generating experimentally realizable protein structures, SCUBA-D readily generates protein structures with not-yet-observed overall folds that are different from those predictable with RoseTTAFold. The accuracy of SCUBA-D was confirmed by the X-ray structures of 16 designed proteins and a protein complex, and by experiments validating designed heme-binding proteins and Ras-binding proteins. Our work shows that deep generative models of images or texts can be fruitfully extended to complex physical objects like protein structures by addressing outstanding issues such as the data-out-of-distribution errors.
