Evaluation of residue-residue contact prediction methods: From retrospective to prospective.

Affiliations

University of Chinese Academy of Sciences, Beijing, China.

Centre for High Performance Computing, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.

Publication information

PLoS Comput Biol. 2021 May 24;17(5):e1009027. doi: 10.1371/journal.pcbi.1009027. eCollection 2021 May.

DOI:10.1371/journal.pcbi.1009027
PMID:34029314
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8177648/
Abstract

Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has brought tremendous progress in residue contact prediction, so a comprehensive assessment of current methods on a large-scale benchmark data set is much needed. In this study, we evaluate 18 contact predictors on 610 non-redundant proteins and 32 CASP13 targets from a wide range of perspectives. The results show that different methods suit different application scenarios: (1) DL methods based on multiple categories of inputs and large training sets are the best choices for low-contact-density proteins, such as intrinsically disordered proteins and proteins with shallow multiple sequence alignments (MSAs). (2) With at least 5L (L is the sequence length) effective sequences in the MSA, all methods show their best performance, and methods that rely only on the MSA as input achieve results comparable to those of methods that adopt multi-source inputs. (3) For top L/5 and L/2 predictions, DL methods predict more hydrophobic interactions, while ECA methods predict more salt bridges and disulfide bonds. (4) ECA methods detect more secondary structure interactions, while DL methods accurately uncover more contact patterns and prune isolated false positives. In general, multi-input DL methods with large training sets dominate current approaches with the best overall performance. Despite the great success of current DL methods, there is still much room for improvement: (1) With shallow MSAs, performance is greatly affected. (2) Current methods show lower precision for inter-domain than for intra-domain contact predictions, as well as a very high imbalance in precision among intra-domain predictions. (3) Strong prediction similarities between DL methods indicate that more feature types and more diversified models need to be developed. (4) The runtime of most methods can be further optimized.
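The "top L/5" and "top L/2" figures mentioned in the abstract refer to a ranking-based precision: residue pairs are sorted by predicted contact score, and the fraction of true contacts among the highest-ranked L/k pairs is reported. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `top_lk_precision`, the nested-list input format, and the minimum sequence separation of 6 residues are assumptions made for illustration; the abstract does not state the separation threshold or the contact definition that the study used.

```python
def top_lk_precision(pred, true, k=5, min_sep=6):
    """Precision of the top L/k predicted contacts (illustrative sketch).

    pred: L x L nested list of predicted contact scores (higher = more likely).
    true: L x L nested list of 0/1 native contacts.
    k: divisor applied to the sequence length L (5 gives top L/5).
    min_sep: minimum sequence separation |i - j|; 6 is an assumed value.
    """
    L = len(pred)
    # Collect each residue pair once (i < j), skipping pairs close in sequence.
    pairs = [(pred[i][j], i, j)
             for i in range(L) for j in range(i + min_sep, L)]
    if not pairs:
        return 0.0
    # Rank by score, highest first, and keep the top L/k pairs (at least one).
    pairs.sort(key=lambda t: t[0], reverse=True)
    top = pairs[:max(1, L // k)]
    hits = sum(true[i][j] for _, i, j in top)
    return hits / len(top)


# Toy example with L = 10 residues; all values are made up.
L = 10
pred = [[0.0] * L for _ in range(L)]
true = [[0] * L for _ in range(L)]
toy = [((0, 9), 0.9, 1), ((1, 8), 0.8, 1), ((2, 9), 0.7, 0),
       ((0, 6), 0.6, 1), ((3, 9), 0.5, 0)]
for (i, j), score, contact in toy:
    pred[i][j] = score
    true[i][j] = contact

p_l5 = top_lk_precision(pred, true, k=5)  # top 2 pairs, both true contacts
p_l2 = top_lk_precision(pred, true, k=2)  # top 5 pairs, 3 true contacts
print(p_l5, p_l2)
```

In the toy data the two highest-scoring pairs are both native contacts, so the top L/5 precision is 1.0, while the top L/2 list of five pairs contains three native contacts, giving 0.6. Only the upper triangle is read, so a symmetric score matrix works unchanged.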
