Zhang Yuan, Lin Shili, Biswas Swati
Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas 75080, U.S.A.
Department of Statistics, The Ohio State University, Columbus, Ohio 43210, U.S.A.
Biometrics. 2017 Mar;73(1):344-355. doi: 10.1111/biom.12567. Epub 2016 Aug 1.
Finding rare variants and gene-environment interactions (GXE) is critical in dissecting complex diseases. We consider the problem of detecting GXE where G is a rare haplotype and E is a nongenetic factor. Existing methods for this problem typically assume G-E independence, which may not hold in many applications. A pertinent example is lung cancer: there is evidence that variants on chromosome 15q25.1 interact with smoking to affect the risk. However, these variants are also associated with smoking behavior, rendering the assumption of G-E independence inappropriate. With the motivation of detecting GXE under G-E dependence, we extend an existing approach, logistic Bayesian LASSO, which assumes G-E independence (LBL-GXE-I), by modeling G-E dependence through a multinomial logistic regression (referred to as LBL-GXE-D). Unlike LBL-GXE-I, LBL-GXE-D controls type I error rates in all situations; however, it has reduced power when G-E independence holds. To control type I error without sacrificing power, we further propose a unified approach, LBL-GXE, which incorporates uncertainty in the G-E independence assumption by employing a reversible jump Markov chain Monte Carlo method. Our simulations show that LBL-GXE has power similar to that of LBL-GXE-I when G-E independence holds, yet has well-controlled type I error rates in all situations. To illustrate the utility of LBL-GXE, we analyzed a lung cancer dataset and found several significant interactions in the 15q25.1 region, including one between a specific rare haplotype and smoking.