End-to-End Differentiable Learning of Protein Structure

Affiliations

Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA; Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA.

Publication information

Cell Syst. 2019 Apr 24;8(4):292-301.e3. doi: 10.1016/j.cels.2019.03.006. Epub 2019 Apr 17.

DOI: 10.1016/j.cels.2019.03.006

PMID: 31005579

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6513320/

Abstract

Predicting protein structure from sequence is a central challenge of biochemistry. Co-evolution methods show promise, but an explicit sequence-to-structure map remains elusive. Advances in deep learning that replace complex, human-designed pipelines with differentiable models optimized end to end suggest the potential benefits of similarly reformulating structure prediction. Here, we introduce an end-to-end differentiable model for protein structure learning. The model couples local and global protein structure via geometric units that optimize global geometry without violating local covalent chemistry. We test our model using two challenging tasks: predicting novel folds without co-evolutionary data and predicting known folds without structural templates. In the first task, the model achieves state-of-the-art accuracy, and in the second, it comes within 1-2 Å; competing methods using co-evolution and experimental templates have been refined over many years, and it is likely that the differentiable approach has substantial room for further improvement, with applications ranging from drug discovery to protein design.
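The abstract does not say how the geometric units are built. As an illustration only, one common way to extend a chain without violating local covalent geometry is to place each new atom from a fixed bond length, a fixed bond angle, and a free torsion angle (the natural extension reference frame construction). The sketch below assumes that construction; it is not the authors' implementation, and the function names, seed coordinates, and numeric values are hypothetical.

```python
import math

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    norm = math.sqrt(dot(u, u))
    return (u[0] / norm, u[1] / norm, u[2] / norm)

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place atom d after atoms a, b, c (hypothetical helper).

    The result satisfies |d - c| = bond_length, angle(b, c, d) = bond_angle,
    and dihedral(a, b, c, d) = torsion. Angles are in radians.
    """
    bc = unit(sub(c, b))                 # local x axis, along the last bond
    n = unit(cross(sub(b, a), bc))       # normal to the a-b-c plane
    m = cross(n, bc)                     # completes the right-handed frame
    x = -bond_length * math.cos(bond_angle)
    y = bond_length * math.sin(bond_angle) * math.cos(torsion)
    z = bond_length * math.sin(bond_angle) * math.sin(torsion)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def dihedral(a, b, c, d):
    """Signed dihedral angle of four points, used here to check the result."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n2 = cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(cross(b1, b2), n2)
    return math.atan2(y, x)

# Illustrative seed atoms and geometry (not taken from the paper).
a, b, c = (-0.5, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
d = place_atom(a, b, c,
               bond_length=1.5,
               bond_angle=math.radians(110.0),
               torsion=math.radians(-60.0))
```

Because the bond length and bond angle are fixed inputs, only the torsion changes the global shape, and every step is a smooth function of that torsion, which is what would let a gradient-based optimizer adjust global geometry through such a unit.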
