Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France.
Univ Rennes, CNRS, Inria, IRISA-UMR 6074, Rennes, France.
Mol Ecol Resour. 2019 Mar;19(2):526-535. doi: 10.1111/1755-0998.12985.
Comparing the molecular diversity of all plankton populations present in geographically distant water columns may allow for a holistic view of the connectivity, isolation and adaptation of organisms in the marine environment. In this context, large-scale detection and analysis of genomic variants directly in metagenomic data appears to be a powerful strategy for identifying genetic structures and genes under natural selection in plankton. Here, we used DiscoSnp++, a reference-free variant caller, to produce genetic variants from large-scale metagenomic data and assessed its accuracy on the copepod Oithona nana in terms of variant calling, allele frequency estimation and population genomic statistics by comparing it to the state-of-the-art method. DiscoSnp++ produces variants that lead to similar conclusions regarding the genetic structure and the identification of loci under natural selection. DiscoSnp++ was then applied to 120 metagenomic samples from four size fractions, including prokaryotes, protists and zooplankton, sampled from 39 Tara Oceans sampling stations located in the Atlantic Ocean and the Mediterranean Sea, to produce a new set of marine genomic markers containing more than 19 million variants. This new genomic resource can be used by the community to relocate these markers on their plankton genomes or transcriptomes of interest. The resource will be updated as new marine expeditions take place and more metagenomic data become available (availability: http://bioinformatique.rennes.inria.fr/taravariants/).