Francis Warren R, Wörheide Gert
Department of Earth and Environmental Sciences, Paleontology and Geobiology, Ludwig-Maximilians-Universität München, Munich, Germany.
GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany.
Genome Biol Evol. 2017 Jun 1;9(6):1582-1598. doi: 10.1093/gbe/evx103.
One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with a likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes, which argues against differences in genome usage between animal groups and suggests that the transcribed portion is more dependent on genome size than previously thought.
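The intron-to-intergenic ratio discussed in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`merge`, `covered`, `composition`), the single-contig input layout, and the toy coordinates are all hypothetical, and "transcribed" is approximated here as the merged span of annotated genes, which is an assumption rather than the definition used in the paper.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end), hypothetical convention


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered(intervals: List[Interval]) -> int:
    """Number of bases covered by the union of the intervals."""
    return sum(end - start for start, end in merge(intervals))


def composition(
    contig_length: int,
    genes: List[Tuple[Interval, List[Interval]]],
) -> Dict[str, Optional[float]]:
    """Partition one contig into exonic, intronic, and intergenic bases.

    `genes` is a list of (gene_interval, exon_intervals) pairs.
    """
    genic = covered([gene for gene, _ in genes])
    exonic = covered([exon for _, exons in genes for exon in exons])
    intronic = genic - exonic
    intergenic = contig_length - genic
    return {
        "exonic": exonic,
        "intronic": intronic,
        "intergenic": intergenic,
        # Undefined when the annotation leaves no intergenic sequence.
        "intron_to_intergenic": intronic / intergenic if intergenic else None,
        # Assumption: genic span stands in for the transcribed portion.
        "genic_fraction": genic / contig_length,
    }


# Toy annotation (made-up coordinates, not data from the paper).
toy_genes = [
    ((1000, 4000), [(1000, 1500), (3500, 4000)]),
    ((6000, 9000), [(6000, 6500), (7000, 7500), (8500, 9000)]),
]
result = composition(10_000, toy_genes)
print(result)
```

In this toy contig the ratio falls somewhat below 1:1. Removing the second gene from `toy_genes` turns its introns and exons into apparent intergenic sequence, so the ratio drops further, which is the kind of shift the abstract attributes to underannotation.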