Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA.
Genet Epidemiol. 2011 Nov;35(7):620-31. doi: 10.1002/gepi.20610. Epub 2011 Aug 4.
In this article, we develop a powerful test for identifying single nucleotide polymorphism (SNP)-sets that are predictive of survival with data from genome-wide association studies. We first group typed SNPs into SNP-sets based on genomic features and then apply a score test to assess the overall effect of each SNP-set on the survival outcome through a kernel machine Cox regression framework. This approach uses genetic information from all SNPs in the SNP-set simultaneously and accounts for linkage disequilibrium (LD), leading to a powerful test with reduced degrees of freedom when the typed SNPs are in LD with each other. This type of test also has the advantage of capturing the potentially nonlinear effects of the SNPs, SNP-SNP interactions (epistasis), and the joint effects of multiple causal variants. By simulating SNP data based on the LD structure of real genes from the HapMap project, we demonstrate that our proposed test is more powerful than the standard single SNP minimum P-value-based test for association studies with censored survival outcomes. We illustrate the proposed test with a real data application.
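The SNP-set score test described above can be illustrated with a short sketch. This is not the authors' implementation and rests on several assumptions that the abstract does not state: the null Cox model has no covariates (so the martingale residuals reduce to the event indicator minus the Nelson-Aalen cumulative hazard), the kernel is linear, the statistic takes the quadratic form Q = MᵀKM that is usual for kernel machine score tests, and the p-value comes from permutation rather than from an analytic null distribution. The function names `martingale_residuals_null` and `kernel_score_test` are hypothetical.

```python
import numpy as np


def martingale_residuals_null(time, event):
    """Martingale residuals under a covariate-free null model (an assumption):
    M_i = delta_i - H(T_i), with H the Nelson-Aalen cumulative hazard."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    # Distinct event times, events at each time, and subjects still at risk.
    uniq = np.unique(time[event == 1])
    d = np.array([np.sum((time == t) & (event == 1)) for t in uniq], dtype=float)
    r = np.array([np.sum(time >= t) for t in uniq], dtype=float)
    increments = d / r
    # Cumulative hazard evaluated at each subject's observed time.
    H = np.array([increments[uniq <= t].sum() for t in time])
    return event - H


def kernel_score_test(G, time, event, n_perm=999, seed=0):
    """Quadratic-form score statistic Q = M' K M for one SNP-set.

    G is an (n subjects x p SNPs) genotype matrix coded 0/1/2.
    The p-value is obtained by permuting residuals (illustrative choice).
    """
    G = np.asarray(G, dtype=float)
    M = martingale_residuals_null(time, event)
    K = G @ G.T  # linear kernel; other kernels could be substituted here
    q_obs = float(M @ K @ M)
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_perm):
        Mp = rng.permutation(M)
        if float(Mp @ K @ Mp) >= q_obs:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)
    return q_obs, p_value


if __name__ == "__main__":
    # Simulated toy data: 200 subjects, 10 SNPs, no true association.
    rng = np.random.default_rng(1)
    G = rng.binomial(2, 0.3, size=(200, 10))
    t_event = rng.exponential(1.0, size=200)
    t_cens = rng.exponential(2.0, size=200)
    time = np.minimum(t_event, t_cens)
    event = (t_event <= t_cens).astype(int)
    q, p = kernel_score_test(G, time, event, n_perm=199)
    print(f"Q = {q:.3f}, permutation p-value = {p:.3f}")
```

Because the whole SNP-set enters through a single kernel matrix, correlated (high-LD) SNPs contribute overlapping information to Q rather than separate tests, which is the intuition behind the reduced effective degrees of freedom mentioned in the abstract.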