Hayeck Tristan J, Zaitlen Noah A, Loh Po-Ru, Vilhjalmsson Bjarni, Pollack Samuela, Gusev Alexander, Yang Jian, Chen Guo-Bo, Goddard Michael E, Visscher Peter M, Patterson Nick, Price Alkes L
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
Lung Biology Center, School of Medicine, University of California, San Francisco, San Francisco, CA 94158, USA.
Am J Hum Genet. 2015 May 7;96(5):720-30. doi: 10.1016/j.ajhg.2015.03.004. Epub 2015 Apr 16.
We introduce a liability-threshold mixed linear model (LTMLM) association statistic for case-control studies and show that it has a well-controlled false-positive rate and more power than existing mixed-model methods for diseases with low prevalence. Existing mixed-model methods suffer a loss in power under case-control ascertainment, but no solution has been proposed. Here, we solve this problem by using a χ² score statistic computed from posterior mean liabilities (PMLs) under the liability-threshold model. Each individual's PML is conditional not only on that individual's case-control status but also on every individual's case-control status and the genetic relationship matrix (GRM) obtained from the data. The PMLs are estimated with a multivariate Gibbs sampler; the liability-scale phenotypic covariance matrix is based on the GRM, and a heritability parameter is estimated via Haseman-Elston regression on case-control phenotypes and then transformed to the liability scale. In simulations of unrelated individuals, the LTMLM statistic was correctly calibrated and achieved higher power than existing mixed-model methods for diseases with low prevalence, and the magnitude of the improvement depended on sample size and severity of case-control ascertainment. In a Wellcome Trust Case Control Consortium 2 multiple sclerosis dataset with >10,000 samples, LTMLM was correctly calibrated and attained a 4.3% improvement (p = 0.005) in χ² statistics over existing mixed-model methods at 75 known associated SNPs, consistent with simulations. Larger increases in power are expected at larger sample sizes. In conclusion, case-control studies of diseases with low prevalence can achieve higher power with LTMLM than with existing mixed-model methods.
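The two liability-scale ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it ignores the GRM and the multivariate Gibbs sampler entirely, and conditions each individual's liability only on that individual's own case-control status, so the PML reduces to the mean of a truncated standard normal. The heritability conversion shown is one commonly used observed-to-liability transformation under case-control ascertainment; the abstract does not say which formula the paper uses, so treat it as an assumption. All function and parameter names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # liability is assumed to be standard normal


def liability_threshold(prevalence):
    """Threshold t with P(liability > t) = prevalence."""
    return _STD_NORMAL.inv_cdf(1.0 - prevalence)


def univariate_pml(is_case, prevalence):
    """Posterior mean liability given only the individual's own status.

    Simplification (assumption): no GRM and no other individuals are used,
    so this is just the mean of a truncated standard normal.
    """
    t = liability_threshold(prevalence)
    density = _STD_NORMAL.pdf(t)
    if is_case:
        return density / prevalence          # E[l | l > t]
    return -density / (1.0 - prevalence)     # E[l | l <= t]


def observed_to_liability_h2(h2_observed, prevalence, case_fraction):
    """Convert observed-scale heritability to the liability scale.

    Assumed formula: h2_l = h2_o * [K(1-K)]^2 / (z^2 * P(1-P)),
    with K the population prevalence, P the case fraction in the sample,
    and z the standard normal density at the liability threshold.
    """
    k, p = prevalence, case_fraction
    z = _STD_NORMAL.pdf(liability_threshold(k))
    return h2_observed * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


if __name__ == "__main__":
    k = 0.01  # hypothetical low-prevalence disease
    print("case PML:", univariate_pml(True, k))
    print("control PML:", univariate_pml(False, k))
    print("liability-scale h2:", observed_to_liability_h2(0.2, k, 0.5))
```

At low prevalence the case PML is large and the control PML is close to zero, which gives some intuition for why a statistic built on liabilities treats cases and controls asymmetrically. The full method additionally conditions on every individual's status and on the GRM.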