Fold recognition by combining profile-profile alignment and support vector machine.

Authors

Han Sangjo, Lee Byung-Chul, Yu Seung Taek, Jeong Chan-Seok, Lee Soyoung, Kim Dongsup

Affiliation

Department of Biosystems, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea.

Publication information

Bioinformatics. 2005 Jun 1;21(11):2667-73. doi: 10.1093/bioinformatics/bti384. Epub 2005 Mar 15.

DOI:10.1093/bioinformatics/bti384
PMID:15769835
Abstract

MOTIVATION

Currently, the most accurate fold-recognition method is to perform profile-profile alignments and estimate the statistical significances of those alignments by calculating Z-score or E-value. Although this scheme is reliable in recognizing relatively close homologs related at the family level, it has difficulty in finding the remote homologs that are related at the superfamily or fold level.
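The Z-score scheme mentioned above can be illustrated with a minimal sketch. It assumes the background distribution is simply the set of raw alignment scores of one query against every template in the library. The abstract does not specify how the background is built, so the function name and that choice are illustrative only.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Standardize raw alignment scores of one query against a template library.

    Assumption: the background is the library-wide score distribution for
    this query. Other schemes (e.g. shuffled sequences) are also in use.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        # All scores identical: no template stands out from the background.
        return [0.0 for _ in raw_scores]
    return [(s - mu) / sigma for s in raw_scores]


# Hypothetical raw scores for four templates; the last one is an outlier.
example = z_scores([10.0, 12.0, 11.0, 30.0])
```

A template is reported as a hit when its Z-score exceeds a chosen cutoff. The abstract's point is that this single summary number works at the family level but loses power for superfamily- and fold-level relationships.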

RESULTS

In this paper, we present an alternative method for estimating the significance of the alignments. The alignment between a query protein and a template of length n in the fold library is transformed into a feature vector of length n + 1, which is then evaluated by a support vector machine (SVM). The SVM output is converted to a posterior probability that the query sequence is related to the template. Results show that the new method performs significantly better than PSI-BLAST and profile-profile alignment with the Z-score scheme. While PSI-BLAST and the Z-score scheme detect 16% and 20% of superfamily-related proteins, respectively, at 90% specificity, the new method detects 46% of these proteins, a more than 2-fold increase in sensitivity. More significantly, at the fold level, the new method detects 14% of remotely related proteins at 90% specificity, a remarkable result given that the other methods detect almost none at the same level of specificity.
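The pipeline described in the abstract can be sketched roughly as follows. Several details are assumptions, because the abstract does not give them: the n + 1 features are taken here to be one alignment score per template position plus the total score, the posterior is obtained with a Platt-style sigmoid whose parameters `a` and `b` are placeholders, and all function names are hypothetical. The last function shows one way to compute "sensitivity at 90% specificity", the metric used to report the results.

```python
import math


def alignment_to_features(per_position_scores, template_length):
    """Hypothetical encoding: n per-position scores followed by their sum."""
    if len(per_position_scores) != template_length:
        raise ValueError("expected one score per template position")
    return list(per_position_scores) + [sum(per_position_scores)]


def svm_output_to_posterior(decision_value, a=-1.0, b=0.0):
    """Map an SVM decision value to a probability with a sigmoid.

    Assumption: Platt-style form 1 / (1 + exp(a * f + b)). In practice a and b
    would be fitted on held-out data; the defaults here are placeholders.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def sensitivity_at_specificity(scores, labels, specificity=0.90):
    """Fraction of true pairs scoring above a cutoff that keeps the
    requested specificity on the unrelated pairs."""
    if not 0.0 < specificity <= 1.0:
        raise ValueError("specificity must be in (0, 1]")
    negatives = sorted((s for s, y in zip(scores, labels) if not y), reverse=True)
    positives = [s for s, y in zip(scores, labels) if y]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative")
    # Number of unrelated pairs allowed above the cutoff (small epsilon
    # guards against floating-point error in 1 - specificity).
    allowed_fp = int(math.floor((1.0 - specificity) * len(negatives) + 1e-9))
    # Hits must score strictly above the (allowed_fp + 1)-th best negative.
    threshold = negatives[allowed_fp]
    detected = sum(1 for s in positives if s > threshold)
    return detected / len(positives)


# Toy data: ten unrelated pairs and four related pairs (made-up scores).
neg = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.9]
pos = [0.5, 0.6, 0.2, 0.95]
toy_scores = neg + pos
toy_labels = [False] * len(neg) + [True] * len(pos)
toy_sensitivity = sensitivity_at_specificity(toy_scores, toy_labels, 0.90)
```

Because the feature vector length depends on the template length n, this encoding implies a separate classifier per template rather than one shared model. That reading follows from the abstract's wording but is not confirmed by it.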

Similar articles

1
Fold recognition by combining profile-profile alignment and support vector machine.
Bioinformatics. 2005 Jun 1;21(11):2667-73. doi: 10.1093/bioinformatics/bti384. Epub 2005 Mar 15.
2
Remote protein homology detection and fold recognition using two-layer support vector machine classifiers.
Comput Biol Med. 2011 Aug;41(8):687-99. doi: 10.1016/j.compbiomed.2011.06.004. Epub 2011 Jun 25.
3
Support vector machines with profile-based kernels for remote protein homology detection.
Genome Inform. 2004;15(2):191-200.
4
SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection.
Bioinformatics. 2008 Mar 15;24(6):783-90. doi: 10.1093/bioinformatics/btn028. Epub 2008 Feb 1.
5
Application of latent semantic analysis to protein remote homology detection.
Bioinformatics. 2006 Feb 1;22(3):285-90. doi: 10.1093/bioinformatics/bti801. Epub 2005 Nov 29.
6
SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
7
Protein remote homology detection based on auto-cross covariance transformation.
Comput Biol Med. 2011 Aug;41(8):640-7. doi: 10.1016/j.compbiomed.2011.05.015. Epub 2011 Jun 12.
8
Application of nonnegative matrix factorization to improve profile-profile alignment features for fold recognition and remote homolog detection.
BMC Bioinformatics. 2008 Jul 1;9:298. doi: 10.1186/1471-2105-9-298.
9
Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection.
Bioinformatics. 2008 May 15;24(10):1264-70. doi: 10.1093/bioinformatics/btn112. Epub 2008 Mar 31.
10
A machine learning information retrieval approach to protein fold recognition.
Bioinformatics. 2006 Jun 15;22(12):1456-63. doi: 10.1093/bioinformatics/btl102. Epub 2006 Mar 17.

Cited by

1
Recognition of 27-class protein folds by adding the interaction of segments and motif information.
Biomed Res Int. 2014;2014:262850. doi: 10.1155/2014/262850. Epub 2014 Jul 21.
2
eThread: a highly optimized machine learning-based approach to meta-threading and the modeling of protein tertiary structures.
PLoS One. 2012;7(11):e50200. doi: 10.1371/journal.pone.0050200. Epub 2012 Nov 21.
3
Boosting Protein Threading Accuracy.
Res Comput Mol Biol. 2009;5541:31-45. doi: 10.1007/978-3-642-02008-7_3.
4
RaptorX: exploiting structure information for protein alignment by statistical inference.
Proteins. 2011;79 Suppl 10(Suppl 10):161-71. doi: 10.1002/prot.23175. Epub 2011 Oct 11.
5
Conotoxin protein classification using free scores of words and support vector machines.
BMC Bioinformatics. 2011 May 29;12:217. doi: 10.1186/1471-2105-12-217.
6
Application of nonnegative matrix factorization to improve profile-profile alignment features for fold recognition and remote homolog detection.
BMC Bioinformatics. 2008 Jul 1;9:298. doi: 10.1186/1471-2105-9-298.
7
Predicting and improving the protein sequence alignment quality by support vector regression.
BMC Bioinformatics. 2007 Dec 3;8:471. doi: 10.1186/1471-2105-8-471.
8
Functional annotation prediction: all for one and one for all.
Protein Sci. 2006 Jun;15(6):1557-62. doi: 10.1110/ps.062185706. Epub 2006 May 2.