Hayeck Tristan J, Loh Po-Ru, Pollack Samuela, Gusev Alexander, Patterson Nick, Zaitlen Noah A, Price Alkes L
Institute for Genomic Medicine, Columbia University, New York, NY 10032, USA; Department of Biostatistics, Columbia University, New York, NY 10032, USA.
Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
Am J Hum Genet. 2017 Jan 5;100(1):31-39. doi: 10.1016/j.ajhg.2016.11.015. Epub 2016 Dec 22.
Mixed models have become the tool of choice for genetic association studies; however, standard mixed model methods may be poorly calibrated or underpowered under family sampling bias and/or case-control ascertainment. Previously, we introduced a liability threshold-based mixed model association statistic (LTMLM) to address case-control ascertainment in unrelated samples. Here, we consider family-biased case-control ascertainment, where case and control subjects are ascertained non-randomly with respect to family relatedness. Previous work has shown that this type of ascertainment can severely bias heritability estimates; we show here that it also impacts mixed model association statistics. We introduce a family-based association statistic (LT-Fam) that is robust to this problem. Similar to LTMLM, LT-Fam is computed from posterior mean liabilities (PML) under a liability threshold model; however, LT-Fam uses published narrow-sense heritability estimates to avoid the problem of biased heritability estimation, enabling correct calibration. In simulations with family-biased case-control ascertainment, LT-Fam was correctly calibrated (average χ² = 1.00-1.02 for null SNPs), whereas the Armitage trend test (ATT), standard mixed model association (MLM), and case-control retrospective association test (CARAT) were mis-calibrated (e.g., average χ² = 0.50-1.22 for MLM, 0.89-2.65 for CARAT). LT-Fam also attained higher power than other methods in some settings. In 1,259 type 2 diabetes-affected case subjects and 5,765 control subjects from the CARe cohort, downsampled to induce family-biased ascertainment, LT-Fam was correctly calibrated whereas ATT, MLM, and CARAT were again mis-calibrated. Our results highlight the importance of modeling family sampling bias in case-control datasets with related samples.
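The abstract does not spell out how a posterior mean liability is computed. As a rough illustration only, the sketch below covers the simplest textbook case: a single unrelated individual whose liability is standard normal and who is a case when the liability exceeds a threshold set by the disease prevalence. In that case the PML is just the mean of a truncated normal. This is not the LT-Fam computation, which conditions on family relatedness and a published heritability estimate; the function name `pml_unrelated` and the prevalence value are hypothetical and chosen for illustration.

```python
from statistics import NormalDist


def pml_unrelated(prevalence: float) -> tuple[float, float]:
    """Return (case_pml, control_pml) for a standard-normal liability.

    Assumption: one unrelated individual, no genotype or family
    information, so the posterior depends only on case-control status.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    std_normal = NormalDist()
    # Threshold t chosen so that P(liability > t) equals the prevalence.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    density = std_normal.pdf(threshold)
    # Means of the upper and lower truncated standard normal.
    case_pml = density / prevalence
    control_pml = -density / (1.0 - prevalence)
    return case_pml, control_pml


if __name__ == "__main__":
    # Hypothetical prevalence, not a value taken from the paper.
    case_pml, control_pml = pml_unrelated(0.08)
    print(f"case PML = {case_pml:.3f}, control PML = {control_pml:.3f}")
```

Weighting the two values by the prevalence and its complement gives zero, which matches the zero mean assumed for liability in the population. With related individuals, each person's posterior also depends on the phenotypes of their relatives, so a closed form like this is no longer enough.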