AgResearch, Invermay Agricultural Centre, Mosgiel 9053, New Zealand
Genetics. 2018 Jun;209(2):389-400. doi: 10.1534/genetics.118.300831. Epub 2018 Mar 27.
High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being used in population genetic studies across a diverse range of species. Two side effects of these methods, however, are (1) sequencing errors and (2) heterozygous genotypes miscalled as homozygous because only one of the two alleles at a locus was sequenced, which occurs when the sequencing depth is insufficient. Both of these errors have a profound effect on the estimation of linkage disequilibrium (LD) and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, to estimate pairwise linkage disequilibrium from low-coverage sequencing data while accounting for undercalled heterozygous genotypes and sequencing errors. Our findings show that GUS-LD gives accurate estimates, whereas LD is underestimated if no adjustment is made for these errors.
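To illustrate why insufficient depth causes heterozygotes to be undercalled, the following minimal Python sketch computes the chance that every read at a truly heterozygous locus comes from the same allele. It is not part of GUS-LD and does not reproduce its likelihood. It assumes each read samples either allele with equal probability and ignores sequencing error. The function name `undercall_probability` is hypothetical.

```python
def undercall_probability(depth: int) -> float:
    """Probability that a true heterozygote shows reads from only one allele.

    Assumption (not from the paper): each read samples one of the two
    alleles independently with probability 1/2, and there are no
    sequencing errors.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # The first read fixes which allele is seen; each of the remaining
    # (depth - 1) reads matches it with probability 1/2.
    return 0.5 ** (depth - 1)


if __name__ == "__main__":
    for depth in (1, 2, 4, 8):
        print(f"depth={depth}: P(only one allele seen)={undercall_probability(depth)}")
```

Under these assumptions a heterozygote covered by two reads appears homozygous half of the time, and the probability halves with each additional read, which is why the problem is concentrated in low-coverage data.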