
Using ensemble classifier to identify membrane protein types.

Author information

Shen H-B, Chou K-C

Affiliation

Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China.

Publication information

Amino Acids. 2007;32(4):483-8. doi: 10.1007/s00726-006-0439-2. Epub 2006 Oct 12.

Abstract

Predicting membrane protein type is both an important and a challenging topic in current molecular and cellular biology, because knowledge of the type often provides useful clues to the function of an uncharacterized membrane protein. With the explosion of newly found protein sequences in the post-genomic era, there is great demand for a computational method that identifies membrane protein types quickly and reliably from primary sequence alone. This paper introduces a novel classifier, the so-called "ensemble classifier". It is formed by fusing a set of nearest neighbor (NN) classifiers, each defined in a different pseudo amino acid composition space. The type of a query protein is determined by a vote among these constituent classifiers. The self-consistency test, jackknife test, and independent dataset test showed that the ensemble classifier outperformed other existing classifiers widely used in the biological literature. The ensemble classifier idea is anticipated to also improve prediction quality when classifying other protein attributes from sequence.
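The voting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simplified pseudo amino acid composition built from a single residue property (Kyte-Doolittle hydropathy), treats each value of the correlation depth `lam` as one feature space, and uses Euclidean distance with a plain majority vote. The function names, the `weight` value, the choice of `lambdas`, and the labels used in the usage note are all hypothetical.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index. Using this single property is an
# assumption made for illustration only.
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def pse_aac(seq, lam, weight=0.05):
    """Simplified pseudo amino acid composition with 20 + lam components.

    The first 20 components are amino acid frequencies; the remaining lam
    components are sequence-order correlation factors at gaps 1..lam.
    Only the 20 standard residues are supported.
    """
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    counts = Counter(seq)
    freqs = [counts[a] / len(seq) for a in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared hydropathy difference between residues k apart.
        diffs = [(HYDROPATHY[a] - HYDROPATHY[b]) ** 2
                 for a, b in zip(seq, seq[k:])]
        thetas.append(sum(diffs) / (len(seq) - k))
    denom = 1.0 + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def nearest_neighbor_label(query_vec, labeled_vecs):
    """Return the label of the training vector closest to query_vec."""
    _, label = min(labeled_vecs, key=lambda item: math.dist(query_vec, item[0]))
    return label


def ensemble_predict(query, training, lambdas=(1, 2, 3)):
    """Majority vote over one NN classifier per feature space (one per lam).

    training is a list of (sequence, label) pairs. Ties go to the label
    that received its first vote earliest.
    """
    votes = []
    for lam in lambdas:
        labeled = [(pse_aac(seq, lam), label) for seq, label in training]
        votes.append(nearest_neighbor_label(pse_aac(query, lam), labeled))
    return Counter(votes).most_common(1)[0][0]
```

A call such as `ensemble_predict("LLLIIIVVVAAL", [("LLLLIIIIVVVVAAAA", "type_A"), ("KKKKDDDDEEEERRRR", "type_B")])` runs three NN classifiers, one per value in `lambdas`, and returns the label with the most votes. The sequences and labels here are toy data, not taken from the paper's datasets.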

