Clinical Research Center, Nagasaki University Hospital, 1-7-1 Sakamoto, Nagasaki, Nagasaki 852-8501, Japan.
Biostatistics, Graduate School of Medicine, Kurume University, 67 Asahi-machi, Kurume, Fukuoka, 830-0011, Japan.
PLoS One. 2018 Jul 5;13(7):e0199692. doi: 10.1371/journal.pone.0199692. eCollection 2018.
In genome-wide association studies (GWASs) of binary traits (or case-control samples) that require adjustment for covariates, researchers often use a logistic regression model to test variants for disease association. Popular tests include the Wald, likelihood ratio, and score tests. For the likelihood ratio and Wald tests, the maximum likelihood estimate (MLE), which requires an iterative procedure, must be computed for each single nucleotide polymorphism (SNP). In contrast, the score test requires the MLE only under the null model and therefore has a lower computational cost than the other tests. Genotype data usually include missing genotypes because of assay failures. Missing genotypes reduce the computational efficiency of the conventional score test (CST), which must re-estimate the null model for each SNP after excluding individuals whose genotype is missing at that SNP. In this study, we propose two new score tests, called PM1 and PM2, that use a single global null estimator for all SNPs regardless of missing genotypes, thereby enabling faster computation than CST. We prove that PM2 and CST have equivalent asymptotic power and that the power of PM1 is asymptotically lower than that of PM2. We evaluate the type I error rates and power of the proposed methods through simulation studies and an application to real GWAS data provided by the Alzheimer's Disease Neuroimaging Initiative (ADNI); the results confirm our theoretical findings. In the ADNI-GWAS application, the proposed score tests were about 6-18 times faster than the existing tests (CST, Wald tests, and likelihood ratio tests). Our score tests are general and applicable to other regression models.
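To make the computational argument concrete, the following is a minimal sketch of a textbook one-degree-of-freedom score test for a logistic model, together with a CST-style wrapper that refits the null model on the individuals with an observed genotype. It is not an implementation of PM1 or PM2, whose definitions are not given in this abstract. All function names (`fit_null_logistic`, `score_test`, `conventional_score_test`, `chi2_1df_pvalue`) are hypothetical, and the sketch assumes that missing genotypes are coded as NaN and that the covariate matrix `X` already contains an intercept column.

```python
import math
import numpy as np


def fit_null_logistic(X, y, n_iter=25, tol=1e-10):
    """Newton-Raphson MLE for a logistic regression of y on X (no SNP term)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = mu * (1.0 - mu)                      # Bernoulli variances
        grad = X.T @ (y - mu)                    # score vector for beta
        hess = X.T @ (X * w[:, None])            # Fisher information for beta
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def score_test(g, X, y, beta0):
    """Score statistic for adding genotype g, evaluated at the null estimate beta0."""
    mu = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    w = mu * (1.0 - mu)
    U = g @ (y - mu)                             # score for the SNP effect
    XtWX = X.T @ (X * w[:, None])
    XtWg = X.T @ (w * g)
    # Variance of U after adjusting for the estimated covariate effects.
    V = g @ (w * g) - XtWg @ np.linalg.solve(XtWX, XtWg)
    return U * U / V


def conventional_score_test(g, X, y):
    """CST-style test: refit the null model on individuals with observed g."""
    obs = ~np.isnan(g)
    beta0 = fit_null_logistic(X[obs], y[obs])    # iterative fit repeated per SNP
    return score_test(g[obs], X[obs], y[obs], beta0)


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))
```

In this sketch the iterative fit inside `conventional_score_test` runs once per SNP whenever the set of observed individuals changes, which is the cost that a single global null estimate is meant to avoid.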