CNRS UMR 6026, ICM, Equipe B@SIC, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes, France.
BMC Microbiol. 2010 Mar 23;10:88. doi: 10.1186/1471-2180-10-88.
The functions of proteins are strongly related to their localization in cell compartments (for example, the cytoplasm or membranes), but the experimental determination of the sub-cellular localization of proteomes is laborious and expensive. A fast and low-cost alternative is in silico prediction, based on features of the protein primary sequences. However, biologists face a very large number of computational tools that use different methods and address various localization features with diverse specificities and sensitivities. As a result, exploiting these computer resources to predict protein localization accurately involves querying all tools and comparing every prediction output, which is a painstaking task. We therefore developed a comprehensive database, called CoBaltDB, that gathers all prediction outputs concerning complete prokaryotic proteomes.
The current version of CoBaltDB integrates the results of 43 localization predictors for 784 complete bacterial and archaeal proteomes (2,548,292 proteins in total). CoBaltDB supplies a simple, user-friendly interface for retrieving and exploring relevant information about predicted features (such as signal peptide cleavage sites and transmembrane segments). Data are organized into three work-sets ("specialized tools", "meta-tools" and "additional tools"). The database can be queried using the organism name, a locus tag or a list of locus tags, and may be browsed using numerous graphical and text displays.
With its new functionalities, CoBaltDB is a novel and powerful platform that provides easy access to the results of multiple localization tools and supports the prediction of prokaryotic protein localizations with higher confidence than previously possible. CoBaltDB is available at http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software/cobalten.
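To make the idea of comparing outputs from several localization tools concrete, the following is a minimal Python sketch of a simple majority vote over the predictions for one protein, grouped by the three work-sets named above. It is an illustration only: the abstract does not describe CoBaltDB's data format or how it combines predictions, so the dictionary layout, the placeholder tool names (`tool_a` to `tool_e`), the labels, and the voting rule are all assumptions.

```python
from collections import Counter

# Hypothetical predictions for a single protein, keyed by work-set and
# then by tool name. Tool names and labels are placeholders, not real
# CoBaltDB output. None means the tool made no localization call.
predictions = {
    "specialized tools": {
        "tool_a": "membrane",
        "tool_b": "membrane",
        "tool_c": "cytoplasm",
    },
    "meta-tools": {"tool_d": "membrane"},
    "additional tools": {"tool_e": None},
}


def consensus(predictions):
    """Return (most frequent label, fraction of voting tools that agree).

    This is a plain majority vote, assumed here for illustration; it is
    not presented as the method used by CoBaltDB.
    """
    votes = Counter(
        label
        for tools in predictions.values()
        for label in tools.values()
        if label is not None  # ignore tools that made no call
    )
    if not votes:
        return None, 0.0
    label, count = votes.most_common(1)[0]
    return label, count / sum(votes.values())


label, agreement = consensus(predictions)
print(label, agreement)
```

A low agreement fraction would flag proteins whose predictions conflict and that may deserve manual inspection of the individual tool outputs.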