Prediction of mitochondrial proteins of malaria parasite using bi-profile Bayes feature extraction.

Affiliation

Department of Mathematics, Northeastern University, Shenyang, China.

Publication information

Biochimie. 2011 Apr;93(4):778-82. doi: 10.1016/j.biochi.2011.01.013. Epub 2011 Jan 31.

Abstract

Mitochondrial proteins of Plasmodium falciparum are considered attractive targets for anti-malarial drugs, but identifying them experimentally is difficult and time-consuming. Computational prediction offers an alternative. However, commonly used subcellular location prediction methods are unsuited to P. falciparum mitochondrial proteins, and the existing organism- and organelle-specific methods were built on a rather small dataset. In this study, a new dataset termed PfM233 was established, comprising 108 mitochondrial and 125 non-mitochondrial proteins with sequence similarity below 25%, and methods for predicting mitochondrial proteins of P. falciparum are described. Bi-profile Bayes and split amino acid composition were both applied to extract features from the N- and C-terminal sequences of these proteins, and the features were then used to construct two SVM-based classifiers (PfMP-N25 and PfMP-30). On PfM233, PfMP-N25 and PfMP-30 achieved accuracies (Matthews correlation coefficients, MCCs) of 90.13% (0.80) and 90.99% (0.82), respectively. When tested on the commonly used 40 mitochondrial proteins in PfM175 and the 108 mitochondrial proteins in PfM233, both methods clearly outperformed the existing general, organelle-specific, and organism- and organelle-specific methods.
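To make the feature-extraction step more concrete, the following is a minimal sketch of bi-profile Bayes encoding as it is commonly described: each residue in a fixed-length terminal window is replaced by its position-specific frequency in the positive (mitochondrial) training set and, separately, in the negative set, and the two halves are concatenated. The function names, the toy sequences, and the window length of 4 are illustrative assumptions and are not taken from the paper; the abstract does not give the exact encoding details, and the split amino acid composition features and the SVM training step are omitted here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_profile(seqs, length):
    """Per-position residue frequencies over the first `length` residues.

    Returns a list of dicts, one per position, mapping residue -> frequency.
    """
    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs if len(s) > pos)
        total = sum(counts.values())
        profile.append(
            {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return profile


def bpb_features(seq, pos_profile, neg_profile):
    """Encode `seq` as [positive-profile values] + [negative-profile values].

    Residues outside the 20 standard amino acids, and positions beyond the
    end of a short sequence, are encoded as 0.0 (an assumption of this sketch).
    """
    length = len(pos_profile)
    window = seq[:length]
    pad = [0.0] * (length - len(window))
    pos_part = [pos_profile[i].get(aa, 0.0) for i, aa in enumerate(window)]
    neg_part = [neg_profile[i].get(aa, 0.0) for i, aa in enumerate(window)]
    return pos_part + pad + neg_part + pad


# Hypothetical toy data: N-terminal fragments, not real P. falciparum sequences.
toy_positive = ["MKRL", "MKKL"]
toy_negative = ["MAEL", "MDEL"]
WINDOW = 4  # illustrative window length only

pos_profile = position_profile(toy_positive, WINDOW)
neg_profile = position_profile(toy_negative, WINDOW)

features = bpb_features("MKRL", pos_profile, neg_profile)
print(features)  # a vector of length 2 * WINDOW
```

In a full pipeline, vectors like `features` would be computed for every protein and passed to an SVM, with the profiles estimated from training data only so that test sequences do not leak into the encoding.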
