Protein subcellular localization based on PSI-BLAST and machine learning.

Authors

Guo Jian, Pu Xian, Lin Yuanlie, Leung Howard

Affiliation

Laboratory of Statistical Computation, Department of Mathematical Sciences, Tsinghua University, China.

Publication information

J Bioinform Comput Biol. 2006 Dec;4(6):1181-95. doi: 10.1142/s0219720006002405.

Abstract

Subcellular location is an important functional annotation of proteins. An automatic, reliable, and efficient prediction system for protein subcellular localization is necessary for large-scale genome analysis. This paper describes a protein subcellular localization method that extracts features from protein profiles rather than from amino acid sequences. A protein profile represents a protein family and discards the part of the sequence information that is not conserved throughout the family; it is therefore more sensitive than the amino acid sequence. The amino acid compositions of the whole profile and of the N-terminus of the profile are each extracted to train and test probabilistic neural network classifiers. On two benchmark datasets, the overall accuracies of the proposed method reach 89.1% and 68.9%, respectively. The prediction results show that the proposed method performs better than methods based on amino acid sequences. The prediction results of the proposed method are also compared with those of Subloc on two redundancy-reduced datasets.
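
The two ingredients named in the abstract, profile-based amino acid composition and a probabilistic neural network (PNN), can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a profile is available as a length-by-20 matrix of non-negative per-position amino acid weights (for example, derived from PSI-BLAST output), that composition is the column average normalized to sum to 1, and that the PNN is the standard Gaussian Parzen-window classifier. The names `profile_composition`, `ProbabilisticNeuralNetwork`, the `n_terminal` and `sigma` parameters, the toy profiles, and the class labels are all hypothetical.

```python
import numpy as np


def profile_composition(profile, n_terminal=None):
    """Average a (length x 20) profile over positions and normalize to sum to 1.

    If n_terminal is given, only the first n_terminal positions are used
    (an assumed definition of the N-terminal composition).
    """
    p = np.asarray(profile, dtype=float)
    if n_terminal is not None:
        p = p[:n_terminal]
    comp = p.mean(axis=0)
    total = comp.sum()
    return comp / total if total > 0 else comp


class ProbabilisticNeuralNetwork:
    """Gaussian Parzen-window classifier (textbook PNN, assumed here)."""

    def __init__(self, sigma=0.1):
        self.sigma = sigma  # kernel width; a hypothetical default

    def fit(self, X, y):
        # A PNN has no iterative training: it stores the training patterns.
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        self.classes_ = np.unique(self.y_)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Squared Euclidean distance from every query to every stored pattern.
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=2)
        kernel = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Per-class score: mean kernel activation over that class's patterns.
        scores = np.stack(
            [kernel[:, self.y_ == c].mean(axis=1) for c in self.classes_],
            axis=1,
        )
        return self.classes_[scores.argmax(axis=1)]


# Toy profiles (hypothetical data): each class is enriched in one column.
profile_a = np.zeros((30, 20))
profile_a[:, 0] = 1.0
profile_b = np.zeros((30, 20))
profile_b[:, 5] = 1.0
query = np.zeros((30, 20))
query[:, 0] = 0.9
query[:, 5] = 0.1

train_X = [profile_composition(profile_a), profile_composition(profile_b)]
train_y = ["cytoplasmic", "nuclear"]  # hypothetical location labels

pnn = ProbabilisticNeuralNetwork(sigma=0.1).fit(train_X, train_y)
prediction = pnn.predict([profile_composition(query)])
n_term_features = profile_composition(query, n_terminal=10)
```

The abstract does not state how the whole-profile and N-terminal feature sets are combined, so the sketch trains on the whole-profile composition only and computes the N-terminal composition separately.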
