Goldstein Benjamin A, Yang Lingyao, Salfati Elias, Assimes Themistocles L
Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, United States of America.
Quantitative Sciences Unit, Stanford School of Medicine, Palo Alto, California, United States of America.
Genet Epidemiol. 2015 Sep;39(6):439-45. doi: 10.1002/gepi.21912. Epub 2015 Jul 22.
Genetic risk scores are an increasingly popular tool for summarizing the cumulative association of a set of single nucleotide polymorphisms (SNPs) with disease. Typically, these scores comprise only the SNPs that have reached genome-wide significance. However, recent work suggests that including additional SNPs may aid risk assessment. In this paper, we used the Atherosclerosis Risk in Communities (ARIC) Study cohort to illustrate how one can choose the optimal set of SNPs for a genetic risk score (GRS). In addition to the P-value threshold, we also examined linkage disequilibrium, imputation quality, and imputation type. We provide a variety of evaluation metrics. Results suggest that the P-value threshold had the greatest impact on GRS quality for the outcome of coronary heart disease, with an optimal threshold around 0.001. In contrast, GRSs are relatively robust to both linkage disequilibrium and imputation quality. We also show that the optimal GRS partially depends on the evaluation metric and, consequently, on the way one intends to use the GRS. Overall, the results highlight both the robustness of GRSs and a means to empirically choose the best GRS.
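As a rough illustration of the idea in the abstract, the sketch below builds a weighted GRS from SNPs that pass a P-value threshold and an imputation-quality filter, then recomputes the score at several thresholds. It is not the authors' procedure: the function names (`select_snps`, `genetic_risk_score`), the `min_info` cutoff of 0.8, and the toy data are all assumptions made for illustration, and linkage-disequilibrium pruning is omitted.

```python
import numpy as np

def select_snps(p_values, info_scores, p_threshold=1e-3, min_info=0.8):
    """Return a boolean mask of SNPs passing both filters.

    p_threshold and min_info are illustrative defaults (assumptions),
    not values prescribed by the paper.
    """
    p_values = np.asarray(p_values, dtype=float)
    info_scores = np.asarray(info_scores, dtype=float)
    return (p_values <= p_threshold) & (info_scores >= min_info)

def genetic_risk_score(dosages, betas, mask):
    """Weighted GRS: for each person, sum dosage * effect size over selected SNPs."""
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(betas, dtype=float)
    return dosages[:, mask] @ betas[mask]

# Toy data (hypothetical): 3 individuals x 4 SNPs, risk-allele dosages in [0, 2].
dosages = np.array([[0, 1, 2, 1],
                    [2, 0, 1, 0],
                    [1, 1, 0, 2]])
betas = np.array([0.10, 0.05, 0.20, 0.02])    # per-allele log odds ratios
p_values = np.array([1e-9, 5e-4, 2e-3, 0.3])  # association P-values
info = np.array([0.99, 0.95, 0.90, 0.60])     # imputation quality scores

# Recompute the score at progressively more liberal P-value thresholds.
for threshold in (5e-8, 1e-3, 1e-2):
    mask = select_snps(p_values, info, p_threshold=threshold)
    scores = genetic_risk_score(dosages, betas, mask)
    print(threshold, int(mask.sum()), scores)
```

In practice each candidate score would then be compared on held-out data with whichever evaluation metric matches the intended use, since the abstract notes that the optimal GRS partly depends on that metric.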