Origin matters: Using a local reference genome improves measures in population genomics.

Author affiliations

School of Biological and Chemical Sciences, Queen Mary University of London, London, UK.

Department of Life Sciences, Imperial College London, London, UK.

Publication information

Mol Ecol Resour. 2023 Oct;23(7):1706-1723. doi: 10.1111/1755-0998.13838. Epub 2023 Jul 25.

Abstract

Genome sequencing enables answering fundamental questions about the genetic basis of adaptation, population structure and epigenetic mechanisms. Yet, we usually need a suitable reference genome for mapping population-level resequencing data. In some model systems, multiple reference genomes are available, giving the challenging task of determining which reference genome best suits the data. Here, we compared the use of two different reference genomes for the three-spined stickleback (Gasterosteus aculeatus), one novel genome derived from a European gynogenetic individual and the published reference genome of a North American individual. Specifically, we investigated the impact of using a local reference versus one generated from a distinct lineage on several common population genomics analyses. Through mapping genome resequencing data of 60 sticklebacks from across Europe and North America, we demonstrate that genetic distance among samples and the reference genomes impacts downstream analyses. Using a local reference genome increased mapping efficiency and genotyping accuracy, effectively retaining more and better data. Despite comparable distributions of the metrics generated across the genome using SNP data (i.e. π, Tajima's D and F_ST), window-based statistics using different references resulted in different outlier genes and enriched gene functions. A marker-based analysis of DNA methylation distributions had a comparably high overlap in outlier genes and functions, yet with distinct differences depending on the reference genome. Overall, our results highlight how using a local reference genome decreases reference bias to increase confidence in downstream analyses of the data. Such results have significant implications in all reference-genome-based population genomic analyses.
