
Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

Authors

Wang Sheng, Sun Siqi, Li Zhen, Zhang Renyu, Xu Jinbo

Affiliation

Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America.

Publication

PLoS Comput Biol. 2017 Jan 5;13(1):e1005324. doi: 10.1371/journal.pcbi.1005324. eCollection 2017 Jan.

DOI:10.1371/journal.pcbi.1005324
PMID:28056090
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5249242/
Abstract

MOTIVATION

Protein contacts contain key information for understanding protein structure and function, so contact prediction from sequence is an important problem. Exciting progress has recently been made on this problem, but the predicted contacts for proteins without many sequence homologs are still of low quality and not very useful for de novo structure prediction.

METHOD

This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformations of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformations of pairwise information, including the output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationships and thus obtain higher-quality contact prediction regardless of how many sequence homologs are available for the proteins in question.
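The two-stage layout described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single residual block per stage, the pre-activation ordering, the kernel sizes, the concatenation of `v[i]` and `v[j]` to lift 1D features to 2D, the symmetrization step, and all names (`predict_contacts`, `params`, `w1d_a`, and so on) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def conv1d_same(x, w):
    """1D convolution with zero 'same' padding. x: (L, C_in), w: (K, C_in, C_out), K odd."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([np.einsum("kc,kco->o", xp[i:i + k], w) for i in range(x.shape[0])])

def conv2d_same(x, w):
    """2D convolution with zero 'same' padding. x: (L, L, C_in), w: (K, K, C_in, C_out), K odd."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    n = x.shape[0]
    out = np.empty((n, n, w.shape[3]))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.einsum("abc,abco->o", xp[i:i + k, j:j + k], w)
    return out

def residual_block(x, w1, w2, conv):
    """Generic residual block: output = x + conv(relu(conv(relu(x))))."""
    h = conv(np.maximum(x, 0.0), w1)
    h = conv(np.maximum(h, 0.0), w2)
    return x + h  # identity shortcut

def pairwise_from_sequential(v):
    """Lift per-residue features (L, C) to per-pair features (L, L, 2C).
    Assumption: plain concatenation of v[i] and v[j]."""
    n, c = v.shape
    vi = np.broadcast_to(v[:, None, :], (n, n, c))
    vj = np.broadcast_to(v[None, :, :], (n, n, c))
    return np.concatenate([vi, vj], axis=-1)

def predict_contacts(seq_feat, pair_feat, params):
    """Stage 1: 1D residual block on sequential features.
    Stage 2: 2D residual block on [stage-1 output, pairwise features such as EC scores]."""
    h = residual_block(seq_feat, params["w1d_a"], params["w1d_b"], conv1d_same)
    x = np.concatenate([pairwise_from_sequential(h), pair_feat], axis=-1)
    x = residual_block(x, params["w2d_a"], params["w2d_b"], conv2d_same)
    logits = x @ params["w_out"]        # (L, L) per-pair score
    logits = 0.5 * (logits + logits.T)  # assumption: enforce a symmetric map
    return 1.0 / (1.0 + np.exp(-logits))

# Toy run with random (untrained) weights and random input features.
rng = np.random.default_rng(0)
L, C1, C2 = 12, 4, 3      # sequence length, 1D channels, extra pairwise channels
C = 2 * C1 + C2           # channels entering the 2D stage
params = {
    "w1d_a": rng.normal(scale=0.1, size=(3, C1, C1)),
    "w1d_b": rng.normal(scale=0.1, size=(3, C1, C1)),
    "w2d_a": rng.normal(scale=0.1, size=(3, 3, C, C)),
    "w2d_b": rng.normal(scale=0.1, size=(3, 3, C, C)),
    "w_out": rng.normal(scale=0.1, size=(C,)),
}
probs = predict_contacts(rng.normal(size=(L, C1)), rng.normal(size=(L, L, C2)), params)
print(probs.shape)
```

The sketch only shows how the data flows: per-residue features pass through 1D convolutions, are expanded to an L x L grid, are joined with pairwise inputs, and pass through 2D convolutions to yield one contact probability per residue pair. A real model would stack many such blocks and learn the weights from known structures.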

RESULTS

Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then.
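The "top L/k long-range accuracy" figures quoted above can be read as a precision over the highest-scoring predicted pairs. The following pure-Python sketch shows one way to compute it. The abstract does not define the metric, so the details are assumptions: a pair counts as long-range when its sequence separation is at least 24 residues (the usual CASP convention), the score is the fraction of native contacts among the selected pairs, and the function name `top_long_range_accuracy` is hypothetical.

```python
def top_long_range_accuracy(probs, contacts, k=1, min_sep=24):
    """Fraction of true contacts among the top L/k predicted long-range pairs.

    probs:    L x L nested list of predicted contact probabilities
    contacts: L x L nested list of booleans (native contacts)
    k:        1 for top L, 10 for top L/10, and so on
    min_sep:  minimum sequence separation |i - j| for a long-range pair (assumed 24)
    """
    length = len(probs)
    # Collect each long-range pair once (i < j).
    pairs = [(probs[i][j], i, j)
             for i in range(length)
             for j in range(i + min_sep, length)]
    pairs.sort(key=lambda t: t[0], reverse=True)
    top = pairs[:max(1, length // k)]
    if not top:
        return 0.0  # protein too short to have any long-range pair
    hits = sum(1 for _, i, j in top if contacts[i][j])
    return hits / len(top)

# Toy example: a 30-residue protein with three confident long-range predictions,
# two of which are native contacts, plus one short-range pair that is ignored.
n = 30
toy_probs = [[0.0] * n for _ in range(n)]
toy_contacts = [[False] * n for _ in range(n)]
toy_probs[0][24], toy_probs[1][27], toy_probs[2][29] = 0.9, 0.8, 0.7
toy_probs[0][5] = 0.99  # short-range, excluded by min_sep
toy_contacts[0][24] = toy_contacts[2][29] = True
acc_l10 = top_long_range_accuracy(toy_probs, toy_contacts, k=10)
print(round(acc_l10, 3))  # 2 of the top 3 pairs are contacts -> 0.667
```

Under this reading, the reported 0.77 for top L/10 would mean that, on average, about three quarters of the L/10 most confident long-range predictions are true contacts.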

AVAILABILITY

http://raptorx.uchicago.edu/ContactMap/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/c3b194d6515f/pcbi.1005324.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/12f49f5e5b86/pcbi.1005324.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/5dd9d69c3f8e/pcbi.1005324.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/652b89f52556/pcbi.1005324.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/5da76982bb16/pcbi.1005324.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/de98e0c12150/pcbi.1005324.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/08e7f4343158/pcbi.1005324.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/2813c13df62d/pcbi.1005324.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/8f47a7a0479f/pcbi.1005324.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/fe50e2ae4518/pcbi.1005324.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/65e63a1b1652/pcbi.1005324.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/26f4964ada9a/pcbi.1005324.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/14cafbaab676/pcbi.1005324.g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/a3bc55970f33/pcbi.1005324.g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/ad26a711836d/pcbi.1005324.g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/273ad36882fb/pcbi.1005324.g016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/bbc8d7cd9c34/pcbi.1005324.g017.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/76c08ed51ac6/pcbi.1005324.g018.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/af7b000d5eeb/pcbi.1005324.g019.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/be93395b0639/pcbi.1005324.g020.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/3c55e27d416e/pcbi.1005324.g021.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e99/5249242/02297e3de24a/pcbi.1005324.g022.jpg
