Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, 15 rue René-Descartes, 67084 Strasbourg cedex, France.
C R Biol. 2011 Aug-Sep;334(8-9):671-8. doi: 10.1016/j.crvi.2011.05.016. Epub 2011 Jul 6.
The importance of non-coding RNAs (ncRNAs) in biological processes makes their annotation an essential component of any genome-sequencing project. Identifying ncRNAs in genomes requires expertise and tools distinct from those used for traditional protein-coding gene annotation. Here, we describe the assembly of two automatic annotation pipelines, integrating publicly available tools, for homology-based and de novo ncRNA searches in genomes. We applied both pipelines to 10 Saccharomycotina genomes and identified and annotated 693 ncRNA genes, corresponding to 81% of the ncRNAs expected for those genomes when the number of ncRNAs in Saccharomyces cerevisiae (86) is taken as a reference. Several new ncRNAs, not previously known in the Saccharomycotina clade, were also detected. The results demonstrate the feasibility of automatic ncRNA searches in full genomes and the utility of such approaches in large multi-genome sequencing and annotation projects.
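The 81% figure can be checked from the numbers quoted in the abstract. The short sketch below is illustrative only: the variable names are ours, and it assumes the expected total is simply the S. cerevisiae reference count multiplied by the number of genomes, which is how the abstract's wording reads.

```python
# Figures quoted in the abstract
genomes = 10             # Saccharomycotina genomes analysed
reference_ncrnas = 86    # ncRNAs in S. cerevisiae, used as the per-genome reference
annotated = 693          # ncRNA genes found and annotated by the pipelines

# Assumption: expected total = reference count x number of genomes
expected = genomes * reference_ncrnas
coverage = annotated / expected

print(f"expected: {expected}, coverage: {coverage:.1%}")
```

Under that assumption the expected total is 860 ncRNAs and the coverage is 80.6%, which rounds to the reported 81%.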