Department of Marine Science, University of Otago, Dunedin, New Zealand.
Department of Anatomy, University of Otago, Dunedin, New Zealand.
Mol Ecol Resour. 2022 Oct;22(7):2599-2613. doi: 10.1111/1755-0998.13646. Epub 2022 Jun 5.
Reduced representation sequencing (RRS) is a widely used method to assay the diversity of genetic loci across the genome of an organism. The dominant class of RRS approaches assays loci associated with restriction sites within the genome (restriction site associated DNA sequencing, or RADseq). RADseq is frequently applied to non-model organisms since it enables population genetic studies without relying on well-characterized reference genomes. However, RADseq requires the use of many bioinformatic filters to ensure the quality of genotype calls. These filters can directly affect population genetic inference, and therefore require careful consideration. One widely used filtering approach is the removal of loci that do not conform to expectations of Hardy-Weinberg equilibrium (HWE). We show that, although this filtering approach is widely used, it is rarely described in sufficient detail to enable replication. Furthermore, through analyses of in silico and empirical data sets, we show that some of the most widely used HWE filtering approaches dramatically affect inference of population structure. In particular, removing loci that depart from HWE after pooling across samples, itself a common practice, significantly reduces the degree of inferred population structure within a data set. Based on these results, we provide recommendations for best practice regarding the implementation of HWE filtering for RADseq data sets.
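The effect of pooling described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `hwe_chi2_pvalue`, the genotype counts, and the choice of a one-degree-of-freedom chi-square test are all assumptions made for illustration (exact tests are often preferred in practice, particularly for small counts). Two hypothetical populations are each in HWE at a biallelic locus but differ in allele frequency; testing them separately retains the locus, whereas testing the pooled counts shows a heterozygote deficit and would flag the locus for removal, even though it is exactly the kind of locus that carries population structure.

```python
import math


def hwe_chi2_pvalue(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test (1 df) of genotype counts against HWE.

    Hypothetical helper for illustration only. Returns the p-value;
    monomorphic or empty loci are treated as uninformative (p = 1.0).
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        return 1.0
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Made-up genotype counts (AA, Aa, aa): each population is in HWE on its own.
pop_a = (81, 18, 1)   # allele A frequency 0.9
pop_b = (1, 18, 81)   # allele A frequency 0.1
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

alpha = 0.05  # assumed significance threshold
p_a = hwe_chi2_pvalue(*pop_a)
p_b = hwe_chi2_pvalue(*pop_b)
p_pooled = hwe_chi2_pvalue(*pooled)

# Per-population testing keeps the locus; pooled testing would discard it.
keep_per_population = p_a >= alpha and p_b >= alpha
keep_pooled = p_pooled >= alpha
print(keep_per_population, keep_pooled)
```

In this toy case the locus is strongly differentiated between the two populations, so a filter applied to pooled samples preferentially discards it, which is consistent with the reduction in inferred structure reported in the abstract.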