School of Mathematical and Statistical Sciences, University of Galway, Galway, Ireland.
MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom.
PLoS Genet. 2023 Sep 18;19(9):e1010546. doi: 10.1371/journal.pgen.1010546. eCollection 2023 Sep.
Genome-wide association studies (GWAS) are commonly used to identify genomic variants that are associated with complex traits and to estimate the magnitude of this association for each variant. However, it has been widely observed that association estimates tend to be lower in a replication study than in the study that discovered the associations. This upward bias in the discovery-study estimates of significant variants is caused by a phenomenon known as Winner's Curse. We review existing Winner's Curse correction methods that require only GWAS summary statistics to make adjustments. In addition, we propose modifications that improve existing methods, as well as a novel approach based on the parametric bootstrap. We evaluate and compare the methods, first on a wide variety of simulated data sets and then on real data sets for three different traits. Methods were assessed primarily by the estimated mean squared error (MSE) over significant SNPs. Our results indicate that widely used conditional likelihood based methods tend to perform poorly. The other methods considered perform much more similarly to one another, with our proposed bootstrap method demonstrating very competitive performance. To complement this review, we have developed an R package, 'winnerscurse', which can be used to apply these Winner's Curse adjustment methods to GWAS summary statistics.
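The selection effect described in the abstract can be illustrated with a small simulation. The sketch below is not one of the correction methods reviewed in the paper and does not use the 'winnerscurse' package. It is a toy model in which the function name, the normal distribution of true effects, the common standard error, and all parameter values are assumptions chosen for illustration. Each SNP gets a true effect, a noisy discovery estimate and an independent replication estimate, and SNPs are selected when the discovery z-statistic exceeds a threshold.

```python
import numpy as np


def simulate_winners_curse(n_snps=200_000, effect_sd=0.03, se=0.02,
                           z_threshold=5.45, seed=0):
    """Toy illustration of Winner's Curse (all settings are assumptions).

    z_threshold=5.45 roughly corresponds to a two-sided p-value of 5e-8.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical true effects and two independent noisy estimates of them.
    beta_true = rng.normal(0.0, effect_sd, n_snps)
    beta_disc = beta_true + rng.normal(0.0, se, n_snps)  # discovery study
    beta_rep = beta_true + rng.normal(0.0, se, n_snps)   # replication study

    # Keep only SNPs that are significant in the discovery study.
    significant = np.abs(beta_disc / se) > z_threshold

    # Orient every effect by the sign of its discovery estimate so that
    # overestimation shows up as a larger positive value.
    sign = np.sign(beta_disc[significant])
    error = beta_disc[significant] - beta_true[significant]

    return {
        "n_significant": int(significant.sum()),
        "mean_discovery": float(np.mean(sign * beta_disc[significant])),
        "mean_replication": float(np.mean(sign * beta_rep[significant])),
        "mean_true": float(np.mean(sign * beta_true[significant])),
        "mse_discovery": float(np.mean(error ** 2)),
    }


if __name__ == "__main__":
    for name, value in simulate_winners_curse().items():
        print(f"{name}: {value}")
```

Under these assumptions, the mean discovery estimate among significant SNPs should exceed both the mean true effect and the mean replication estimate, which is the pattern the abstract attributes to Winner's Curse. The `mse_discovery` entry is computable here only because the simulated true effects are known; with real data the MSE over significant SNPs has to be estimated.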