BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
Heart and Lung Research Institute, University of Cambridge, Cambridge, UK.
Int J Epidemiol. 2023 Aug 2;52(4):1209-1219. doi: 10.1093/ije/dyac233.
Genetic associations for variants identified through genome-wide association studies (GWASs) tend to be overestimated in the original discovery data set, because a variant whose association happened to be underestimated may not have been detected. This bias, known as winner's curse, can affect Mendelian randomization estimates, but its severity and potential impact are unclear.
We performed an empirical investigation to assess the potential bias from winner's curse in practice. We considered Mendelian randomization estimates for the effect of body mass index (BMI) on coronary artery disease risk. We randomly divided a UK Biobank data set 100 times into three equally sized subsets. The first subset was treated as the 'discovery GWAS'. We compared genetic associations estimated in the discovery GWAS to those estimated in the other subsets for each of the 100 iterations.
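The repeated three-way split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the participant identifiers, the seeding scheme and the function names `three_way_split` and `repeated_splits` are all hypothetical, and the GWAS and Mendelian randomization steps that would follow each split are omitted.

```python
import random


def three_way_split(ids, seed):
    """Shuffle the ids and cut them into three near-equal, disjoint subsets."""
    rng = random.Random(seed)          # seeded so each split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n = len(shuffled)
    cut1, cut2 = n // 3, 2 * n // 3
    return shuffled[:cut1], shuffled[cut1:cut2], shuffled[cut2:]


def repeated_splits(ids, n_iterations=100, base_seed=0):
    """Return one three-way split per iteration (100 by default)."""
    return [three_way_split(ids, base_seed + i) for i in range(n_iterations)]


# Hypothetical participant identifiers; a real analysis would use cohort IDs.
participant_ids = list(range(9000))
splits = repeated_splits(participant_ids)

# In each iteration the first subset plays the role of the 'discovery GWAS'.
discovery, second, third = splits[0]
```

Seeding each iteration separately is a design choice made here so that any single split can be regenerated without rerunning the others.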
For variants associated with BMI at P < 5 × 10⁻⁸ in at least one iteration, genetic associations with BMI were up to 5-fold greater in iterations in which the variant was associated with BMI at P < 5 × 10⁻⁸ compared with its mean association across all iterations. If the minimum P-value for association with BMI was P = 10⁻¹³ or lower, then this inflation was <25%. Mendelian randomization estimates were affected by winner's curse bias. However, bias did not materially affect results; all analyses indicated a deleterious effect of BMI on coronary artery disease risk.
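The inflation measure described above (mean association in iterations reaching genome-wide significance, divided by the mean across all iterations) can be illustrated with a small simulation. This is a sketch on simulated data, not a reproduction of the study: the effect sizes, the standard error, the number of draws and the function name `inflation_ratio` are all assumptions chosen for illustration, and each estimate is drawn from a normal distribution around a fixed true effect.

```python
import random
from statistics import NormalDist, mean

# Two-sided P < 5e-8 corresponds to |z| above roughly 5.45.
Z_THRESHOLD = NormalDist().inv_cdf(1 - 5e-8 / 2)


def inflation_ratio(true_beta, se, n_iterations, seed):
    """Mean estimate among 'significant' iterations over mean across all."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_beta, se) for _ in range(n_iterations)]
    significant = [b for b in estimates if abs(b / se) > Z_THRESHOLD]
    if not significant:
        return None  # variant never reached genome-wide significance
    return mean(significant) / mean(estimates)


# Hypothetical variants: one weakly and one strongly associated with exposure.
weak = inflation_ratio(true_beta=0.020, se=0.005, n_iterations=100_000, seed=1)
strong = inflation_ratio(true_beta=0.050, se=0.005, n_iterations=100_000, seed=1)

print(f"weak variant inflation:   {weak:.2f}")
print(f"strong variant inflation: {strong:.2f}")
```

In this toy setting the weak variant is selected only when its estimate is unusually large, so its ratio sits well above 1, whereas the strong variant is significant in almost every draw and its ratio stays close to 1. That pattern is in line with the observation that inflation was small for variants with very low minimum P-values.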
Winner's curse can bias Mendelian randomization estimates, although its practical impact may not be substantial. If avoiding sample overlap is infeasible, analysts should consider performing a sensitivity analysis based on variants strongly associated with the exposure.