Zeng Ping, Zhao Yang, Zhang Liwei, Huang Shuiping, Chen Feng
Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China; Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, China.
Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China.
PLoS One. 2014 Mar 27;9(3):e93355. doi: 10.1371/journal.pone.0093355. eCollection 2014.
This paper uses likelihood-based tests to detect rare variants associated with a continuous phenotype within the kernel machine learning framework. Both the likelihood ratio test (LRT) and the restricted likelihood ratio test (ReLRT) are investigated, and the relationship between kernel machine learning and the mixed effects model is discussed. Using the eigenvalue representations of the LRT and ReLRT, their exact finite-sample distributions are obtained by simulation. Numerical studies evaluate the proposed approaches in the contexts of the standard mixed effects model and of kernel machine learning. The results show that the LRT and ReLRT control the type I error correctly at the given α level. Both tests consistently outperform the sequence kernel association test (SKAT), regardless of the sample size and the proportion of negative causal rare variants, and they lose less power than the SKAT when rare variants with both positive and negative effects are present. The LRT and ReLRT performed in the kernel machine learning context have slightly higher power than those performed in the standard mixed effects model context. The Genetic Analysis Workshop 17 exome sequencing SNP data are used as an illustrative example, from which some interesting results are observed. The paper closes with a discussion.
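The simulation of the ReLRT distribution through an eigenvalue representation can be illustrated with a minimal sketch. It assumes the spectral form given by Crainiceanu and Ruppert (2004) for a linear mixed model with a single variance component, applied here under the null hypothesis of no variant effect; whether this matches the paper's implementation in every detail is an assumption. The function name `relrt_null_draws`, the grid over the variance ratio, the linear kernel, and the synthetic genotype data are all hypothetical choices made for illustration.

```python
import numpy as np


def relrt_null_draws(X, K, n_sim=2000, n_grid=200, seed=0):
    """Simulate null draws of the restricted LRT for one variance component.

    X : (n, p) fixed-effect design matrix (assumed full column rank).
    K : (n, n) positive semi-definite kernel matrix for the variant set.
    The supremum over the variance ratio is approximated on a grid
    (an illustrative choice, not necessarily the paper's optimizer).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Projection onto the orthogonal complement of the column space of X.
    Q, _ = np.linalg.qr(X)
    P0 = np.eye(n) - Q @ Q.T

    # Nonzero eigenvalues of P0 K P0; rescaling them only rescales lambda.
    mu = np.linalg.eigvalsh(P0 @ K @ P0)
    mu = mu[mu > 1e-8 * mu.max()]
    mu = mu / mu.max()
    k = mu.size

    # Grid for lambda = (variance component) / (residual variance), with 0.
    lam = np.concatenate(([0.0], np.exp(np.linspace(-12.0, 12.0, n_grid))))
    lm = lam[:, None] * mu[None, :]          # shape (grid, k)
    log_det = np.log1p(lm).sum(axis=1)       # sum_s log(1 + lambda * mu_s)

    # w_s^2 for the k informative directions, plus the pooled remainder.
    w2 = rng.chisquare(1, size=(n_sim, k))
    df_rest = n - p - k
    rest = rng.chisquare(df_rest, size=n_sim) if df_rest > 0 else np.zeros(n_sim)

    num = w2 @ (lm / (1.0 + lm)).T           # N_n(lambda), shape (n_sim, grid)
    den = w2 @ (1.0 / (1.0 + lm)).T + rest[:, None]   # D_n(lambda)
    stat = (n - p) * np.log1p(num / den) - log_det[None, :]
    return stat.max(axis=1)


# Synthetic illustration: intercept plus one covariate, 10 rare variants.
rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
G = rng.binomial(2, 0.02, size=(n, 10)).astype(float)   # hypothetical genotypes
K = G @ G.T                                              # linear kernel
draws = relrt_null_draws(X, K)
```

Given an observed ReLRT statistic, an empirical p-value would then be the fraction of `draws` that are at least as large as it. The simulated distribution has a point mass at zero, which is why a chi-square reference distribution is not used directly in this setting.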