
A comprehensive assessment of sequence-based and template-based methods for protein contact prediction.

Authors

Wu Sitao, Zhang Yang

Affiliation

Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA.

Publication information

Bioinformatics. 2008 Apr 1;24(7):924-31. doi: 10.1093/bioinformatics/btn069. Epub 2008 Feb 22.

DOI:10.1093/bioinformatics/btn069
PMID:18296462
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2648832/
Abstract

MOTIVATION

Pair-wise residue-residue contacts in proteins can be predicted from both threading templates and sequence-based machine learning. However, most structure modeling approaches use only the template-based contact predictions to guide the simulations; this is partly because sequence-based contact predictions are usually considered less accurate than those from threading. With the rapid progress in sequence databases and machine-learning techniques, it is necessary to have a detailed and comprehensive assessment of the contact-prediction methods in different template conditions.

RESULTS

We develop two methods for protein-contact predictions: SVM-SEQ is a sequence-based machine learning approach which trains a variety of sequence-derived features on contact maps; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins which are categorized into 'Easy', 'Medium', 'Hard' and 'Very Hard' targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS obviously outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12-25%. If we combine the SVM-SEQ and SVM-LOMETS predictions together, the total number of correctly predicted contacts in the Hard proteins will increase by more than 60% (or 70% for the long-range contact with a sequence separation ≥24), compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets where the SVM-SEQ is around four times more accurate than SVM-LOMETS in the long-range contact prediction. These data demonstrate that the state-of-the-art sequence-based contact prediction has reached a level which may be helpful in assisting tertiary structure modeling for the targets which do not have close structure templates. The maximum yield should be obtained by the combination of both sequence- and template-based predictions.
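
The comparison described in the results can be illustrated with a minimal Python sketch. It is not the authors' SVM-SEQ or SVM-LOMETS code: the function names, the representation of contacts as sets of residue-index pairs, and the toy data are all assumptions made for illustration. It shows how accuracy (the fraction of predicted contacts that are native), the long-range filter (sequence separation of at least 24), and the combination of two prediction sets might be computed.

```python
from typing import Set, Tuple

Contact = Tuple[int, int]  # (i, j) residue indices with i < j; hypothetical representation


def long_range(contacts: Set[Contact], min_sep: int = 24) -> Set[Contact]:
    """Keep only contacts whose sequence separation |i - j| is at least min_sep."""
    return {(i, j) for i, j in contacts if abs(i - j) >= min_sep}


def correct_count(predicted: Set[Contact], native: Set[Contact]) -> int:
    """Number of predicted contacts that are also native contacts."""
    return len(predicted & native)


def accuracy(predicted: Set[Contact], native: Set[Contact]) -> float:
    """Fraction of predicted contacts that are native (0.0 if nothing is predicted)."""
    if not predicted:
        return 0.0
    return correct_count(predicted, native) / len(predicted)


# Toy data only: these sets are invented for illustration, not taken from the paper.
native = {(1, 30), (2, 40), (5, 60), (10, 12), (3, 50)}
seq_pred = {(1, 30), (5, 60), (7, 45)}    # stand-in for a sequence-based predictor
tmpl_pred = {(2, 40), (10, 12), (8, 20)}  # stand-in for a template-based predictor

# Combining the two predictors here is a plain set union (an assumption).
combined = seq_pred | tmpl_pred

print("template-only correct:", correct_count(tmpl_pred, native))
print("combined correct:     ", correct_count(combined, native))
print("long-range, template: ", correct_count(long_range(tmpl_pred), long_range(native)))
print("long-range, combined: ", correct_count(long_range(combined), long_range(native)))
```

In this toy case the two predictors recover different native contacts, so the union contains more correct contacts than the template-based set alone, which is the kind of complementarity the abstract reports for Hard targets.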
