Simple alignment-free methods for protein classification: a case study from G-protein-coupled receptors.

Authors

Strope Pooja K, Moriyama Etsuko N

Affiliation

Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588-0660

Publication information

Genomics. 2007 May;89(5):602-12. doi: 10.1016/j.ygeno.2007.01.008. Epub 2007 Mar 2.

Abstract

Computational methods of predicting protein functions rely on detecting similarities among proteins. However, sufficient sequence information is not always available for some protein families. For example, proteins of interest may be new members of a divergent protein family. The performance of protein classification methods could vary in such challenging situations. Using the G-protein-coupled receptor superfamily as an example, we investigated the performance of several protein classifiers. Alignment-free classifiers based on support vector machines using simple amino acid compositions were effective in remote-similarity detection even from short fragmented sequences. Although it is computationally expensive, a support vector machine classifier using local pairwise alignment scores showed very good balanced performance. More commonly used profile hidden Markov models were generally highly specific and well suited to classifying well-established protein family members. It is suggested that different types of protein classifiers should be applied to gain the optimal mining power.
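As a minimal sketch of the kind of alignment-free feature the abstract refers to, the snippet below computes a 20-dimensional amino acid composition vector for a protein sequence. The function name `aa_composition`, the residue ordering, and the choice to skip non-standard characters are illustrative assumptions, not details taken from the paper. A vector like this could then be passed to any support vector machine implementation.

```python
from collections import Counter

# The 20 standard amino acids in a fixed (alphabetical) order.
# The ordering is an arbitrary choice made for this illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the relative frequency of each standard amino acid in seq.

    Characters outside the 20 standard residues (gaps, 'X', etc.) are
    ignored; this handling is an assumption of the sketch.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No usable residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # A short made-up fragment, used only to show the call.
    vector = aa_composition("MKTAYIAKQR")
    print(len(vector), round(sum(vector), 6))
```

Because the vector depends only on residue counts, a full-length protein and a short fragment both map to vectors of the same length, so no alignment is needed before classification.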
