Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520.
Baker Institute for Animal Health, Cornell University, Ithaca, New York 14850.
Genetics. 2019 Dec;213(4):1225-1236. doi: 10.1534/genetics.119.302598. Epub 2019 Oct 7.
Longitudinal phenotypes have become increasingly available in genome-wide association studies (GWAS) and electronic health record-based studies, making it possible to identify genetic variants that influence complex traits over time. For longitudinal binary data, significant challenges remain in gene mapping, including misspecification of the phenotype model due to ascertainment. Here, we propose L-BRAT (Longitudinal Binary-trait Retrospective Association Test), a retrospective, generalized estimating equation-based method for genetic association analysis of longitudinal binary outcomes. We also develop RGMMAT, a retrospective, generalized linear mixed model-based association test. Both tests are retrospective score approaches in which genotypes are treated as random conditional on phenotype and covariates. Both allow static and time-varying covariates to be included in the analysis. Through simulations, we show that the retrospective association tests are robust to ascertainment and other types of phenotype model misspecification, and that they gain power over previous association methods. We applied L-BRAT and RGMMAT to a genome-wide association analysis of repeated measures of cocaine use in a longitudinal cohort. Pathway analysis implicated the opioid signaling and axonal guidance signaling pathways. Lastly, we replicated important pathways in an independent cocaine dependence case-control GWAS. Our results illustrate that L-BRAT can detect important loci and pathways in a genome scan and provide insights into the genetic architecture of cocaine use.
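To make the idea of a retrospective score test concrete, the following is a minimal sketch and not the authors' L-BRAT or RGMMAT implementation. It assumes unrelated subjects, a null logistic model fitted with an independence working correlation, and a common genotype variance estimated from the sample; the function names `fit_null_logistic` and `retrospective_score_test` are hypothetical. The score is built from phenotype residuals under the null model, and its variance is taken from the genotype distribution rather than from the phenotype model.

```python
import math
import numpy as np


def fit_null_logistic(y, X, n_iter=25):
    """Fit a logistic regression of y on X by Newton/IRLS steps.

    X must already contain an intercept column. No genotype term is
    included, so this is the null (no-association) phenotype model.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = mu * (1.0 - mu)
        # Newton step: (X' W X)^{-1} X' (y - mu)
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
        beta = beta + step
    return beta


def retrospective_score_test(y, X, subject, g):
    """Simplified retrospective score test (illustrative assumption only).

    y       : binary outcomes, one per visit
    X       : covariates per visit (static or time-varying), with intercept
    subject : subject label for each visit
    g       : genotype per subject, ordered by sorted unique subject label
    Returns (chi-square statistic with 1 df, p-value).
    """
    beta = fit_null_logistic(y, X)
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    resid = y - mu

    # Collapse visit-level residuals to one value per subject.
    ids, idx = np.unique(subject, return_inverse=True)
    a = np.bincount(idx, weights=resid, minlength=len(ids))

    # Centering makes the score insensitive to the genotype mean.
    a_c = a - a.mean()
    score = a_c @ g

    # Retrospective variance: genotypes are random given (y, X), so for
    # independent subjects Var(score) = Var(G) * sum(a_c^2).
    var = g.var(ddof=1) * np.sum(a_c ** 2)
    stat = score ** 2 / var

    # Upper tail of a chi-square with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Because the null distribution comes from the genotypes, the variance does not depend on the phenotype model being correctly specified, which is the intuition behind the robustness to ascertainment. A fuller treatment would replace the identity relationship among subjects with a kinship or genetic relationship matrix and use a non-independence working correlation.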