Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA.
Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA.
Bioinformatics. 2022 Sep 15;38(18):4337-4343. doi: 10.1093/bioinformatics/btac459.
In genome-wide association analyses of population-based biobanks, most diseases have low prevalence, which results in low power to detect associations. One way to address this is to use family disease history, yet existing methods cannot control the type I error inflation induced by increased phenotype correlation among closely related samples or by unbalanced phenotypic distributions.
We propose a new method for genetic association testing with family disease history: the mixed-model-based Test with Adjusted Phenotype and Empirical saddlepoint approximation (TAPE). TAPE controls for increased phenotype correlation by adopting a two-variance-component mixed model, accounts for case-control imbalance by using an empirical saddlepoint approximation, and can incorporate any existing adjusted phenotype, such as phenotypes from the LT-FH method. Through simulation studies and analyses of white British samples from the UK Biobank and Korean samples from the Korean Genome and Epidemiology Study, we show that the proposed method is robust and better calibrated than existing methods while gaining power to detect variant-phenotype associations.
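To make the idea of an empirical saddlepoint approximation concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no formulas, so everything here is an assumption: the score statistic is taken to be S = sum_i g_i * r_i for centered genotypes g_i and null-model residuals r_i, the residuals are treated as independent draws from their own empirical distribution, and the tail probability uses the Barndorff-Nielsen form of the saddlepoint formula. The function names (`make_score_cgf`, `spa_upper_tail`, `normal_upper_tail`) are hypothetical.

```python
import math
from collections import Counter


def make_score_cgf(genotypes, residuals):
    """Return a function t -> (K(t), K'(t), K''(t)) for S = sum_i g_i * R_i,
    assuming each R_i is drawn independently from the empirical
    distribution of `residuals` (an assumption of this sketch)."""
    n = len(residuals)
    g_counts = Counter(genotypes)  # genotypes take few distinct values

    def cgf(t):
        k = k1 = k2 = 0.0
        for g, count in g_counts.items():
            u = t * g
            shift = max(u * r for r in residuals)  # keeps exp() from overflowing
            w = [math.exp(u * r - shift) for r in residuals]
            s0 = sum(w)
            m1 = sum(wi * r for wi, r in zip(w, residuals)) / s0
            m2 = sum(wi * r * r for wi, r in zip(w, residuals)) / s0
            k += count * (shift + math.log(s0 / n))
            k1 += count * g * m1
            k2 += count * g * g * (m2 - m1 * m1)
        return k, k1, k2

    return cgf


def normal_upper_tail(x):
    """P(Z >= x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def spa_upper_tail(s, genotypes, residuals):
    """Approximate P(S >= s) under the null by a saddlepoint approximation."""
    cgf = make_score_cgf(genotypes, residuals)
    _, mean, var = cgf(0.0)
    r_min, r_max = min(residuals), max(residuals)
    # Largest and smallest values S can take under the resampling model.
    sup = sum(g * (r_max if g > 0 else r_min) for g in genotypes)
    inf = sum(g * (r_min if g > 0 else r_max) for g in genotypes)
    if s >= sup:
        return 0.0
    if s <= inf:
        return 1.0
    # The formula is numerically unstable at the mean; use the normal tail there.
    if abs(s - mean) < 1e-6 * math.sqrt(var):
        return normal_upper_tail((s - mean) / math.sqrt(var))

    # K'(t) is increasing in t: bracket the root of K'(t) = s, then bisect.
    lo, hi = -1.0, 1.0
    for _ in range(100):
        if cgf(lo)[1] <= s:
            break
        lo *= 2.0
    for _ in range(100):
        if cgf(hi)[1] >= s:
            break
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cgf(mid)[1] < s:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)

    k, _, k2 = cgf(t)
    w = math.copysign(math.sqrt(max(2.0 * (t * s - k), 0.0)), t)
    v = t * math.sqrt(max(k2, 0.0))
    if w == 0.0 or v == 0.0:
        return normal_upper_tail(w)  # degenerate case: skip the correction term
    return normal_upper_tail(w + math.log(v / w) / w)


if __name__ == "__main__":
    # Toy data: 5% "cases" give strongly skewed residuals with mean zero.
    residuals = [0.95] * 5 + [-0.05] * 95
    genotypes = [1.8] * 1 + [0.8] * 18 + [-0.2] * 81  # centered allele counts
    sd = math.sqrt(make_score_cgf(genotypes, residuals)(0.0)[2])
    for s in (1.0, 2.0, 3.0):
        print(s, spa_upper_tail(s, genotypes, residuals), normal_upper_tail(s / sd))
```

The demo prints the saddlepoint tail probability next to the plain normal approximation for the same score, which is the comparison that matters when the phenotype distribution is unbalanced. A production implementation would work on residuals from the fitted mixed model and would typically apply the saddlepoint step only to scores far from the mean.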
The summary statistics and code generated in this study are available at https://github.com/styvon/TAPE.
Supplementary data are available at Bioinformatics online.