Su Yu-Ru, Di Chong-Zhi, Hsu Li
Biostatistics and Biomathematics Program, Public Health Science Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N, Seattle, WA 98109, USA
Biostatistics. 2017 Jan;18(1):119-131. doi: 10.1093/biostatistics/kxw034. Epub 2016 Jul 28.
The development of next-generation sequencing technologies has allowed researchers to comprehensively study the contribution of genetic variation, particularly rare variants, to complex diseases. To date, many sequencing analyses of rare variants have focused on marginal genetic effects and have not explored the potential role that environmental factors play in modifying genetic risk. Analysis of gene-environment interaction (GxE) for rare variants poses considerable challenges because the variants are rare and few subjects carry them while also being exposed. To tackle this challenge, we propose a hierarchical model to jointly assess the GxE effects of a set of rare variants, for example in a gene or regulatory region, leveraging information across the variants. Under this model, GxE is modeled by two components. The first component incorporates variant functional information as weights to calculate the weighted burden of variant alleles across variants, and then assesses the interaction of this burden with the environmental factor. Since the functional information is known a priori, this component enters the model as fixed effects. The second component involves residual GxE effects that are not accounted for by the fixed effects. In this component, the residual GxE effects are postulated to follow an unspecified distribution with mean 0 and variance [Formula: see text]. We develop a novel testing procedure by deriving two independent score statistics, one for the fixed effects and one for the variance component. We propose two data-adaptive approaches for combining these two score statistics and establish their asymptotic distributions. An extensive simulation study shows that the proposed approaches maintain the correct type I error rate and that their power is comparable to or better than that of existing methods under a wide range of scenarios. Finally, we illustrate the proposed methods with an exome-wide GxE analysis of NSAID use in colorectal cancer.
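The two-component idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy genotype matrix, and the weights are hypothetical, and Fisher's method is used only as a familiar stand-in for combining two independent p-values, since the abstract does not say which combination rules are proposed.

```python
import math


def weighted_burden(genotypes, weights):
    """Per-subject weighted burden: sum over variants j of w_j * G_ij.

    genotypes: list of rows, one per subject, with minor-allele counts.
    weights:   one functional weight per variant (assumed known a priori).
    """
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def burden_by_environment(burden, exposure):
    """Fixed-effect GxE covariate: burden_i * E_i for each subject."""
    return [b * e for b, e in zip(burden, exposure)]


def fisher_combine(p_fixed, p_variance):
    """Combine two independent p-values with Fisher's method.

    Under the null, -2 * (ln p1 + ln p2) is chi-square with 4 degrees of
    freedom, whose survival function has the closed form
    exp(-x / 2) * (1 + x / 2). Both inputs must lie in (0, 1].
    This is an assumed stand-in, not the combination rule from the paper.
    """
    stat = -2.0 * (math.log(p_fixed) + math.log(p_variance))
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


# Toy data (hypothetical): 3 subjects, 3 rare variants.
genotypes = [[0, 1, 0], [1, 0, 2], [0, 0, 0]]
weights = [1.0, 0.5, 2.0]
exposure = [1, 0, 1]

burden = weighted_burden(genotypes, weights)
interaction = burden_by_environment(burden, exposure)
combined_p = fisher_combine(0.05, 0.05)
```

Combining the two p-values in this way relies on the independence of the two score statistics, which is the property the abstract states for the fixed-effect and variance-component tests.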