Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints.

Affiliations

Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK.

The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK.

Publication information

Nat Commun. 2019 Sep 4;10(1):3977. doi: 10.1038/s41467-019-11994-0.

DOI:10.1038/s41467-019-11994-0

PMID:31484923

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6726615/

Abstract

The inapplicability of amino acid covariation methods to small protein families has limited their use for structural annotation of whole genomes. Recently, deep learning has shown promise in allowing accurate residue-residue contact prediction even for shallow sequence alignments. Here we introduce DMPfold, which uses deep learning to predict inter-atomic distance bounds, the main chain hydrogen bond network, and torsion angles, which it uses to build models in an iterative fashion. DMPfold produces more accurate models than two popular methods for a test set of CASP12 domains, and works just as well for transmembrane proteins. Applied to all Pfam domains without known structures, confident models for 25% of these so-called dark families were produced in under a week on a small 200 core cluster. DMPfold provides models for 16% of human proteome UniProt entries without structures, generates accurate models with fewer than 100 sequences in some cases, and is freely available.
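To make the idea of a predicted "distance bound" concrete, the following is a minimal sketch of one way a per-residue-pair distance histogram could be turned into a lower and upper bound. It is not DMPfold's actual procedure: the bin layout, the `mass` parameter, and the function name `distance_bounds` are all assumptions introduced here for illustration.

```python
from typing import List, Tuple


def distance_bounds(
    bin_edges: List[float], probs: List[float], mass: float = 0.8
) -> Tuple[float, float]:
    """Return (lower, upper) distance bounds covering at least `mass` probability.

    Hypothetical helper, not part of DMPfold. `bin_edges` has one more entry
    than `probs`; the interval starts at the most likely bin and grows toward
    whichever neighbouring bin carries more probability.
    """
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("bin_edges must have exactly one more entry than probs")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probs must sum to a positive value")
    p = [x / total for x in probs]  # normalise so the weights sum to 1

    # Start from the single most probable bin.
    lo = hi = max(range(len(p)), key=p.__getitem__)
    covered = p[lo]

    # Grow the interval until enough probability mass is covered
    # or the histogram is exhausted.
    while covered < mass and (lo > 0 or hi < len(p) - 1):
        left = p[lo - 1] if lo > 0 else -1.0
        right = p[hi + 1] if hi < len(p) - 1 else -1.0
        if left >= right:
            lo -= 1
            covered += left
        else:
            hi += 1
            covered += right
    return bin_edges[lo], bin_edges[hi + 1]
```

For example, with assumed 2 Å bins from 4 to 12 Å and weights `[1, 6, 2, 1]`, asking for 75% coverage widens the peak bin (6 to 8 Å) to the interval 6 to 10 Å. Bounds of this kind could then be passed to a structure-building step as restraints and re-predicted in later iterations, which is the general pattern the abstract describes.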

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0ad/6726615/0b452c5799b5/41467_2019_11994_Fig1_HTML.jpg
