Detection of distant evolutionary relationships between protein families using theory of sequence profile-profile comparison.

Affiliation

Institute of Biotechnology, Graiciūno 8, LT-02241 Vilnius, Lithuania.

Publication information

BMC Bioinformatics. 2010 Feb 17;11:89. doi: 10.1186/1471-2105-11-89.

DOI:10.1186/1471-2105-11-89

PMID:20158924

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2837030/

Abstract

BACKGROUND

Detection of common evolutionary origin (homology) is a primary means of inferring protein structure and function. At present, comparison of protein families represented as sequence profiles is arguably the most effective homology detection strategy. However, finding the best way to represent evolutionary information of a protein sequence family in the profile, to compare profiles and to estimate the biological significance of such comparisons, remains an active area of research.

RESULTS

Here, we present a new homology detection method based on sequence profile-profile comparison. The method introduces several new features, including position-dependent gap penalties and a global score system. Position-dependent gap penalties provide a more biologically relevant way to represent and align protein families as sequence profiles. The global score system enables an analytical solution for the statistical parameters needed to estimate the statistical significance of profile-profile similarities. The new method, together with other state-of-the-art profile-based methods (HHsearch, COMPASS and PSI-BLAST), is benchmarked in an all-against-all comparison of a challenging set of SCOP domains that share at most 20% sequence identity. For benchmarking, we use a model-based evaluation framework that does not depend on a reference ("gold standard"). Evaluation results show that, at the level of protein domains, our method compares favorably to all other tested methods. We also provide examples in which the new method outperforms structure-based similarity detection and alignment. The new method is available both as a standalone software package and as a web server at http://www.ibt.lt/bioinformatics/coma.
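To make the idea of position-dependent gap penalties concrete, the following is a minimal sketch and not the method's actual implementation. It assumes a toy 4-letter alphabet with a uniform background, a simple log-odds column score, linear (not affine) gap costs, and a local dynamic-programming recursion. All names (`conserved`, `column_score`, `local_score`, `e_value`) are hypothetical. The `e_value` helper uses the generic extreme-value form E = K·m·n·exp(−λS); `lam` and `K` are placeholder inputs, not the analytically derived parameters described in the abstract.

```python
import math

# Assumption: toy 4-letter alphabet with a uniform background distribution.
BACKGROUND = [0.25, 0.25, 0.25, 0.25]


def conserved(k, weight=0.85):
    """Toy profile column that strongly prefers letter k."""
    rest = (1.0 - weight) / (len(BACKGROUND) - 1)
    return [weight if i == k else rest for i in range(len(BACKGROUND))]


def column_score(p, q):
    """Log-odds score for aligning two profile columns (one illustrative choice)."""
    return math.log(sum(pi * qi / bi for pi, qi, bi in zip(p, q, BACKGROUND)))


def local_score(prof_a, prof_b, gap_a, gap_b):
    """Best local alignment score of two profiles.

    gap_a[i] (gap_b[j]) is the cost of leaving column i of prof_a
    (column j of prof_b) unaligned, so the penalty varies along each profile.
    """
    rows, cols = len(prof_a), len(prof_b)
    H = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            H[i][j] = max(
                0.0,  # local alignment: a path may restart anywhere
                H[i - 1][j - 1] + column_score(prof_a[i - 1], prof_b[j - 1]),
                H[i - 1][j] - gap_a[i - 1],  # column i-1 of prof_a unaligned
                H[i][j - 1] - gap_b[j - 1],  # column j-1 of prof_b unaligned
            )
            best = max(best, H[i][j])
    return best


def e_value(score, len_a, len_b, lam, K):
    """Generic extreme-value estimate; lam and K are placeholder parameters."""
    return K * len_a * len_b * math.exp(-lam * score)


# Two toy families; family_b carries one extra column at index 2.
family_a = [conserved(k) for k in (0, 1, 2, 3)]
family_b = [conserved(k) for k in (0, 1, 0, 2, 3)]

uniform = local_score(family_a, family_b, [5.0] * 4, [5.0] * 5)
relaxed = local_score(family_a, family_b, [5.0] * 4, [5.0, 5.0, 0.5, 5.0, 5.0])
print(uniform, relaxed)
```

In this toy setup, lowering the penalty only at the position of the extra column lets the alignment bridge the insertion, while a uniformly high penalty confines it to a shorter ungapped segment. That is the intuition behind making gap costs depend on profile position.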

CONCLUSION

Due to a number of developments, the new profile-profile comparison method shows an improved ability to match distantly related protein domains. Therefore, the method should be useful for annotation and homology modeling of uncharacterized proteins.
