BaCelLo: a balanced subcellular localization predictor.

Authors

Pierleoni Andrea, Martelli Pier Luigi, Fariselli Piero, Casadio Rita

Affiliation

Biocomputing Group, Dept. of Biology, University of Bologna, via Irnerio 42, 40126 Bologna, Italy.

Publication information

Bioinformatics. 2006 Jul 15;22(14):e408-16. doi: 10.1093/bioinformatics/btl222.

Abstract

MOTIVATION

Knowledge of the subcellular localization of a protein is fundamental for elucidating its function. In eukaryotic cells, subcellular location is difficult to determine with high-throughput experimental procedures. Computational procedures are therefore needed for annotating the subcellular location of proteins in large-scale genomic projects.

RESULTS

BaCelLo is a predictor for five classes of subcellular localization (secretory pathway, cytoplasm, nucleus, mitochondrion and chloroplast), based on different SVMs organized in a decision tree. The system exploits information derived from the residue sequence and the evolutionary information contained in alignment profiles. It analyzes the composition of the whole sequence and the compositions of both the N- and C-termini. The training set is curated to avoid redundancy. For the first time, a balancing procedure is introduced to mitigate the effect of biased training sets. Three kingdom-specific predictors are implemented: for animals, plants and fungi, respectively. When proteins from animals and fungi are distributed into four classes, the accuracy of BaCelLo reaches 74% and 76%, respectively; a score of 67% is obtained when proteins from plants are distributed into five classes. BaCelLo outperforms the other presently available methods for the same task and gives more balanced accuracy and coverage values for each class. We also predict the subcellular localization of five whole proteomes (Homo sapiens, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae and Arabidopsis thaliana) and compare the protein content of each compartment.
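
To make the composition-based input concrete, the following is a minimal sketch of how the residue compositions of the whole sequence and of the N- and C-termini could be turned into a single feature vector. It is an illustration only, not the authors' implementation: the function names `composition` and `feature_vector` and the terminal window lengths (20 residues each) are assumptions, since the abstract does not state them. The sketch also omits the evolutionary information from alignment profiles, the SVM decision tree, and the balancing procedure.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# composition vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the fraction of each standard residue in `segment`.

    Non-standard characters (e.g. 'X') are ignored. An empty segment
    yields a vector of zeros.
    """
    counts = Counter(segment)
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def feature_vector(sequence, n_term=20, c_term=20):
    """Concatenate whole-sequence, N-terminal and C-terminal compositions.

    `n_term` and `c_term` are hypothetical window lengths chosen for
    illustration; the abstract does not specify them.
    """
    seq = sequence.upper()
    whole = composition(seq)
    n_part = composition(seq[:n_term])   # first n_term residues
    c_part = composition(seq[-c_term:])  # last c_term residues
    return whole + n_part + c_part
```

Calling `feature_vector("MALWMRLLPLLALLALWGPDPAAA")` returns a list of 60 values (3 segments × 20 residues). A vector like this could then be passed to any classifier; in BaCelLo the classifiers are SVMs arranged in a decision tree.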

AVAILABILITY

BaCelLo can be accessed at http://www.biocomp.unibo.it/bacello/.
