Sitlani Colleen M, Dupuis Josée, Rice Kenneth M, Sun Fangui, Pitsillides Achilleas N, Cupples L Adrienne, Psaty Bruce M
Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA.
Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA.
Eur J Hum Genet. 2016 Jul;24(7):1022-8. doi: 10.1038/ejhg.2015.253. Epub 2015 Dec 2.
Gene-environment interactions may provide a mechanism for targeting interventions to those individuals who would gain the most benefit from them. Searching for interactions agnostically on a genome-wide scale requires large sample sizes, often achieved through collaboration among multiple studies in a consortium. Family studies can contribute to consortia, but to do so they must account for correlation within families by using specialized analytic methods. In this paper, we investigate the performance of methods that account for within-family correlation, in the context of gene-environment interactions with binary exposures and quantitative outcomes. We simulate both cross-sectional and longitudinal measurements, and analyze the simulated data taking family structure into account, via generalized estimating equations (GEE) and linear mixed-effects models. With sufficient exposure prevalence and correct model specification, all methods perform well. However, when models are misspecified, mixed modeling approaches have seriously inflated type I error rates. GEE methods with robust variance estimates are less sensitive to model misspecification; however, when exposures are infrequent, GEE methods require modifications to preserve the type I error rate. We illustrate the practical use of these methods by evaluating gene-drug interactions on fasting glucose levels in data from the Framingham Heart Study, a cohort that includes related individuals.
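The GEE approach with robust variance estimates mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design or software. It assumes a cross-sectional setting, an identity link, and an independence working correlation, in which case the GEE point estimates coincide with ordinary least squares and the robust standard errors come from a sandwich estimator that sums score contributions within each family. All function names, effect sizes, the minor allele frequency, and the exposure prevalence are hypothetical, and genotypes are drawn independently within families for simplicity, which real pedigrees would not satisfy. No small-sample modification for infrequent exposures is applied.

```python
import math
import random


def simulate_families(n_fam=500, fam_size=4, maf=0.3, prev=0.3,
                      beta_gxe=0.0, seed=1):
    """Return rows (family_id, genotype, exposure, outcome).

    A shared family intercept induces within-family correlation.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for fam in range(n_fam):
        u = rng.gauss(0.0, 1.0)  # family-level random effect
        for _ in range(fam_size):
            g = sum(rng.random() < maf for _ in range(2))  # additive 0/1/2
            e = 1 if rng.random() < prev else 0            # binary exposure
            y = (5.0 + 0.1 * g + 0.3 * e + beta_gxe * g * e
                 + u + rng.gauss(0.0, 1.0))
            rows.append((fam, g, e, y))
    return rows


def invert(m):
    """Gauss-Jordan inverse with partial pivoting (small square matrices)."""
    n = len(m)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gee_independence(rows):
    """Fit y ~ 1 + g + e + g*e; return (beta, robust_se).

    Point estimates are OLS (independence working correlation);
    the variance is the family-clustered sandwich estimator.
    """
    X = [[1.0, g, e, g * e] for _, g, e, _ in rows]
    y = [row[3] for row in rows]
    k = 4
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    bread = invert(xtx)
    beta = [sum(bread[i][j] * xty[j] for j in range(k)) for i in range(k)]

    # Sum score contributions x * residual within each family.
    scores = {}
    for row, x, yi in zip(rows, X, y):
        resid = yi - sum(b * xi for b, xi in zip(beta, x))
        s = scores.setdefault(row[0], [0.0] * k)
        for i in range(k):
            s[i] += x[i] * resid
    meat = [[sum(s[i] * s[j] for s in scores.values()) for j in range(k)]
            for i in range(k)]

    cov = matmul(matmul(bread, meat), bread)
    se = [math.sqrt(cov[i][i]) for i in range(k)]
    return beta, se


if __name__ == "__main__":
    beta, se = gee_independence(simulate_families(beta_gxe=0.0))
    z = beta[3] / se[3]  # Wald statistic for the interaction term
    print(f"interaction estimate={beta[3]:.3f}, robust SE={se[3]:.3f}, z={z:.2f}")
```

Repeating the fit over many simulated data sets with `beta_gxe=0.0` and counting how often the Wald statistic exceeds its critical value gives an empirical type I error rate. Lowering `prev` in such a loop is one way to explore the infrequent-exposure setting that the abstract flags as problematic for unmodified GEE.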