Department of Health Studies, University of Chicago, Chicago, Illinois 60637, USA.
Genet Epidemiol. 2012 Jul;36(5):517-24. doi: 10.1002/gepi.21644. Epub 2012 May 29.
Genome-wide association (GWA) studies have identified several pancreatic cancer (PanCa) susceptibility loci. Methods for assessing polygenic susceptibility can be employed to detect the collective effect of additional association signals for PanCa. Using data on 492,651 autosomal single nucleotide polymorphisms (SNPs) from the PanScan GWA study (2,857 cases, 2,967 controls), we employed polygenic risk score (PRS) cross-validation (CV) methods to (a) confirm the existence of unidentified association signals, (b) assess the predictive value of PRSs, and (c) assess evidence for polygenic effects in specific genomic locations (genic vs. intergenic). After excluding SNPs in known PanCa susceptibility regions, we constructed PRS models using a training GWA dataset and then tested each model in an independent testing dataset using fourfold CV. We also employed a "power-replication" approach, in which power to detect SNP associations was calculated using a training dataset, and power was then tested for association with "replication status" in a testing dataset. PRSs constructed using ≥10% of genome-wide SNPs showed significant association with PanCa (P < 0.05) across the majority of CV analyses. Associations were stronger for PRSs restricted to genic SNPs than for intergenic PRSs. The power-replication approach produced weaker associations that were not significant when restricting to SNPs with low pairwise linkage disequilibrium, whereas PRS results were robust to such restrictions. Although the PRS approach will not dramatically improve PanCa prediction, it provides strong evidence for unidentified association signals for PanCa. Our results suggest that focusing association studies on genic regions and conducting larger GWA studies can reveal additional PanCa susceptibility loci.
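The train-then-test logic of PRS cross-validation can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration and not the study's actual pipeline: the function names (`simulate`, `log_odds_ratio`, `correlation`, `cv_prs`), the toy genotypes, the use of allelic log odds ratios as SNP weights, the ranking of SNPs by absolute z-statistic, and the use of a Pearson correlation between score and case status as a stand-in for a formal association test. The abstract does not specify these details.

```python
import math
import random
from statistics import mean


def simulate(n=400, m=120, n_causal=30, effect=0.25, seed=0):
    """Toy data: minor-allele counts (0/1/2) and case (1) / control (0) labels."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.5) for _ in range(m)]
    geno, status = [], []
    for _ in range(n):
        g = [(rng.random() < f) + (rng.random() < f) for f in freqs]
        # Only the first n_causal SNPs contribute to disease liability.
        liability = sum(effect * (g[j] - 2 * freqs[j]) for j in range(n_causal))
        p_case = 1.0 / (1.0 + math.exp(-liability))
        geno.append(g)
        status.append(1 if rng.random() < p_case else 0)
    return geno, status


def log_odds_ratio(geno, status, j):
    """Allelic log odds ratio and z-statistic for SNP j (0.5 continuity correction)."""
    a = b = c = d = 0.5  # case minor, case major, control minor, control major
    for g, y in zip(geno, status):
        if y:
            a += g[j]
            b += 2 - g[j]
        else:
            c += g[j]
            d += 2 - g[j]
    beta = math.log(a * d / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return beta, beta / se


def correlation(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cv_prs(geno, status, k=4, top_fraction=0.5, seed=1):
    """k-fold CV: estimate SNP weights in training folds, score the held-out fold."""
    n, m = len(geno), len(geno[0])
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    results = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        tr_g = [geno[i] for i in train]
        tr_y = [status[i] for i in train]
        stats = [log_odds_ratio(tr_g, tr_y, j) for j in range(m)]
        # Keep the top fraction of SNPs ranked by |z| in the training data.
        ranked = sorted(range(m), key=lambda j: -abs(stats[j][1]))
        keep = ranked[: max(1, int(top_fraction * m))]
        # PRS = weighted sum of allele counts, weights from training data only.
        scores = [sum(stats[j][0] * geno[i][j] for j in keep) for i in test]
        labels = [status[i] for i in test]
        results.append(correlation(scores, labels))
    return results


if __name__ == "__main__":
    geno, status = simulate()
    for fold, r in enumerate(cv_prs(geno, status), start=1):
        print(f"fold {fold}: PRS vs. case status correlation = {r:.3f}")
```

The key design point is that SNP selection and weighting use only the training folds, so any association between the score and case status in the held-out fold reflects signal that generalizes and not overfitting. Varying `top_fraction` mimics building scores from different proportions of SNPs.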