Instituto Gulbenkian de Ciência, Oeiras, Portugal.
PLoS Comput Biol. 2011 Oct;7(10):e1002217. doi: 10.1371/journal.pcbi.1002217. Epub 2011 Oct 13.
Rab proteins are small GTPases that act as essential regulators of vesicular trafficking. In humans, 44 subfamilies are known, each performing a specific set of functions at distinct subcellular localisations and in distinct tissues. Rab function is conserved even amongst distant orthologs. Hence, the annotation of Rabs yields functional predictions about the cell biology of trafficking. So far, annotating Rabs has been a laborious manual task that is not feasible at the scale of current and future genomic output from deep sequencing technologies. We developed, validated and benchmarked the Rabifier, an automated bioinformatic pipeline for the identification and classification of Rabs, which achieves up to 90% classification accuracy. We catalogued roughly 8,000 Rabs from 247 genomes covering the entire eukaryotic tree. The full Rab database and a web tool implementing the pipeline are publicly available at www.RabDB.org. For the first time, we describe and analyse the evolution of Rabs in a dataset covering the whole eukaryotic phylogeny. We found a highly dynamic family undergoing frequent taxon-specific expansions and losses. We dated the origin of human subfamilies using phylogenetic profiling, which enlarged the Rab repertoire of the Last Eukaryotic Common Ancestor with Rab14, Rab32 and RabL4. Furthermore, a detailed analysis of the Rab family of the choanoflagellate Monosiga brevicollis pinpointed the changes that accompanied the emergence of Metazoan multicellularity, mainly an important expansion and specialisation of the secretory pathway. Lastly, we experimentally established tissue specificity in the expression of mouse Rabs and showed that neo-functionalisation best explains the emergence of new human Rab subfamilies. With the Rabifier and RabDB, we provide tools that allow non-bioinformaticians to easily integrate thousands of Rabs in their analyses. RabDB is designed to enable the cell biology community to keep pace with the increasing number of fully-sequenced genomes and change the scale at which we perform comparative analysis in cell biology.
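To illustrate the general idea behind dating a subfamily by phylogenetic profiling, the sketch below places the origin of a subfamily at the most recent common ancestor of all taxa in which it is found. This is a simplified illustration, not the method used in the paper: the toy tree, the taxon names, the subfamily labels and the function names `lineage` and `inferred_origin` are all hypothetical, and the approach ignores complications such as horizontal transfer or misannotation.

```python
# Toy rooted species tree as child -> parent (hypothetical, for illustration only).
PARENT = {
    "Human": "Metazoa",
    "Fly": "Metazoa",
    "Metazoa": "Opisthokonta",
    "Yeast": "Opisthokonta",
    "Opisthokonta": "LECA",
    "Plant": "LECA",
}


def lineage(node):
    """Return the path from a node up to the root, starting with the node itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def inferred_origin(taxa_with_subfamily):
    """Return the most recent common ancestor of all taxa carrying the subfamily."""
    taxa = list(taxa_with_subfamily)
    if not taxa:
        raise ValueError("at least one taxon is required")
    # Ancestors shared by every taxon in the profile.
    shared = set(lineage(taxa[0]))
    for taxon in taxa[1:]:
        shared &= set(lineage(taxon))
    # Walking upwards from any one taxon, the first shared node is the MRCA.
    for node in lineage(taxa[0]):
        if node in shared:
            return node


# Hypothetical presence profiles: subfamily -> taxa in which it was detected.
profiles = {
    "subfamily_A": {"Human", "Fly"},
    "subfamily_B": {"Human", "Yeast"},
    "subfamily_C": {"Human", "Plant"},
}

origins = {name: inferred_origin(taxa) for name, taxa in profiles.items()}
for name, origin in sorted(origins.items()):
    print(name, "->", origin)
```

Under this simplification, a subfamily detected on both sides of the deepest split in the tree is assigned to the root, which is how a broader genome sample can move the inferred origin of a subfamily further back.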