Ray Debashree, Li Xiang, Pan Wei, Pankow James S, Basu Saonli
Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minn., USA.
Hum Hered. 2015;79(2):69-79. doi: 10.1159/000369858. Epub 2015 Jun 3.
Genome-wide association studies (GWASs) have identified hundreds of genetic variants associated with complex diseases, but these variants appear to explain very little of the disease heritability. The typical single-locus association analysis in a GWAS fails to detect variants with small effect sizes and fails to capture higher-order interactions among these variants. Multilocus association analysis provides a powerful alternative by jointly modeling the variants within a gene or a pathway, which also reduces the burden of multiple hypothesis testing in a GWAS.
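To make the multiple-testing point concrete, the following minimal sketch compares Bonferroni-corrected significance thresholds for SNP-level and gene-level testing. The function name and the test counts (one million SNPs, twenty thousand genes) are illustrative assumptions and are not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Illustrative (assumed) test counts, not values from the study.
snp_level = bonferroni_threshold(0.05, 1_000_000)  # one test per SNP
gene_level = bonferroni_threshold(0.05, 20_000)    # one test per gene

print(f"SNP-level threshold:  {snp_level:.1e}")
print(f"Gene-level threshold: {gene_level:.1e}")
```

Under these assumed counts, the gene-level threshold is 50 times less stringent than the SNP-level one, which is one reason set-based tests can recover signals that single-locus tests miss.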
Here, we propose a powerful and flexible dimension reduction approach to model multilocus association. We use a Bayesian partitioning model that clusters SNPs according to their direction of association, models higher-order interactions using a flexible scoring scheme, and uses posterior marginal probabilities to detect association between the SNP set and the disease.
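The dimension reduction idea can be illustrated with a toy sketch. It is not the authors' Bayesian partitioning model: it assumes the partition of SNPs into risk, null, and protective groups is already given, whereas the method described above infers it. The function name, the label coding, and the simple allele-count score are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical label coding for the direction of association of each SNP.
RISK, NULL, PROTECTIVE = 1, 0, -1


def collapse_by_direction(
    genotypes: Sequence[Sequence[int]],
    labels: Sequence[int],
) -> List[Tuple[int, int]]:
    """Collapse each individual's SNP genotypes (0/1/2 minor-allele counts)
    into two scores: total allele count over risk-labeled SNPs and total
    allele count over protective-labeled SNPs. Null SNPs are ignored."""
    if any(len(row) != len(labels) for row in genotypes):
        raise ValueError("each genotype row needs one entry per SNP label")
    scores = []
    for row in genotypes:
        risk = sum(g for g, z in zip(row, labels) if z == RISK)
        protective = sum(g for g, z in zip(row, labels) if z == PROTECTIVE)
        scores.append((risk, protective))
    return scores


# Two individuals, four SNPs; SNP 3 is labeled null and drops out.
genotypes = [[0, 1, 2, 1], [2, 0, 0, 1]]
labels = [RISK, PROTECTIVE, NULL, RISK]
print(collapse_by_direction(genotypes, labels))
```

In this sketch, a set of p SNPs is reduced to two summary scores per individual for a fixed partition. A Bayesian treatment would instead place a prior on the partition and summarize the evidence for association through posterior probabilities.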
We evaluate our method in extensive simulation studies and apply it to detect multilocus interactions in the Atherosclerosis Risk in Communities (ARIC) GWAS of type 2 diabetes.
We demonstrate that our approach has better power to detect multilocus interactions than several existing approaches. When applied to the ARIC study dataset with 9,328 individuals to study gene-based associations for type 2 diabetes, our method identified some novel variants not detected by conventional single-locus association analyses.