Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l'Evolution de Montpellier, Montpellier, France.
PLoS Genet. 2013 Apr;9(4):e1003457. doi: 10.1371/journal.pgen.1003457. Epub 2013 Apr 11.
In animals, the population genomic literature is dominated by two taxa, namely mammals and drosophilids, in which fully sequenced, well-annotated genomes have been available for years. Data from other metazoan phyla are scarce, probably because the vast majority of living species still lack a closely related reference genome. Here we achieve de novo, reference-free population genomic analysis from wild samples in five non-model animal species, based on next-generation sequencing transcriptome data. We introduce a pipeline for cDNA assembly, read mapping, SNP/genotype calling, and data cleaning, with a specific focus on the detection of hidden paralogy. In two species for which a reference genome is available, similar results were obtained whether the reference was used or not, demonstrating the robustness of our de novo inferences. The population genomic profiles of a hare, a turtle, an oyster, a tunicate, and a termite were found to be intermediate between those of human and Drosophila, indicating that the discordant genomic diversity patterns that have been reported between these two species do not reflect a generalized vertebrate versus invertebrate gap. The average genomic diversity was generally higher in invertebrates than in vertebrates (with the notable exception of the termite), in agreement with the notion that population size tends to be larger in the former than in the latter. The non-synonymous to synonymous ratio, however, did not differ significantly between vertebrates and invertebrates, even though it was negatively correlated with genetic diversity within each of the two groups. This study opens promising perspectives regarding genome-wide population analyses of non-model organisms and the influence of population size on non-synonymous versus synonymous diversity.
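The two summary statistics discussed above, genetic diversity and the non-synonymous to synonymous ratio, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes biallelic SNPs that have already been called and classified, and the function names and the `(alt_count, n_chrom)` input layout are hypothetical choices made for illustration. Per-site diversity is estimated with the standard unbiased heterozygosity estimator 2k(n-k)/(n(n-1)).

```python
from typing import Iterable, Tuple

# Each SNP is (alt_count, n_chrom): copies of the alternate allele and the
# number of sampled chromosomes at that site (hypothetical input layout).
Snp = Tuple[int, int]


def site_heterozygosity(alt_count: int, n_chrom: int) -> float:
    """Unbiased expected heterozygosity at one biallelic site."""
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    if not 0 <= alt_count <= n_chrom:
        raise ValueError("alt_count must lie between 0 and n_chrom")
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def nucleotide_diversity(snps: Iterable[Snp], n_sites: float) -> float:
    """Average per-site diversity over n_sites sites (variable or not)."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return sum(site_heterozygosity(k, n) for k, n in snps) / n_sites


def pn_ps(nonsyn_snps: Iterable[Snp], syn_snps: Iterable[Snp],
          n_nonsyn_sites: float, n_syn_sites: float) -> float:
    """Ratio of non-synonymous to synonymous per-site diversity."""
    pi_n = nucleotide_diversity(nonsyn_snps, n_nonsyn_sites)
    pi_s = nucleotide_diversity(syn_snps, n_syn_sites)
    if pi_s == 0:
        raise ValueError("synonymous diversity is zero; ratio undefined")
    return pi_n / pi_s
```

Under purifying selection this ratio is expected to be below one, and the negative within-group correlation reported above corresponds to lower ratios in species with higher synonymous diversity.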