Pan Wei
Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455-0392, USA.
Genet Epidemiol. 2011 May;35(4):211-6. doi: 10.1002/gepi.20567.
To detect genetic association with common and complex diseases, two powerful yet quite different multimarker association tests have been proposed, genomic distance-based regression (GDBR) (Wessel and Schork [2006] Am J Hum Genet 79:821–833) and kernel machine regression (KMR) (Kwee et al. [2008] Am J Hum Genet 82:386–397; Wu et al. [2010] Am J Hum Genet 86:929–942). GDBR is based on relating a multimarker similarity metric for a group of subjects to variation in their trait values, while KMR is based on nonparametric estimates of the effects of the multiple markers on the trait through a kernel function or kernel matrix. Since the two approaches are both powerful and general, but appear quite different, it is important to know their specific relationships. In this report, we show that, under the condition that there is no other covariate, there is a striking correspondence between the two approaches for a quantitative or a binary trait: if the same positive semi-definite matrix is used as the centered similarity matrix in GDBR and as the kernel matrix in KMR, the F-test statistic in GDBR and the score test statistic in KMR are equal (up to some ignorable constants). The result is based on the connections of both methods to linear or logistic (random-effects) regression models.
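The stated correspondence can be checked numerically for a quantitative trait with no covariates. The sketch below is an illustration under assumptions, not the paper's code: it assumes the GDBR pseudo-F statistic takes the form tr(HGH) / tr((I-H)G(I-H)), with H the projection onto the centered trait and G the centered similarity matrix, and that the KMR score statistic is proportional to (y - ybar)' K (y - ybar), with scaling constants omitted. The simulated genotypes, the linear kernel K = XX', and all variable names are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10

# Hypothetical data: genotypes coded 0/1/2 and a quantitative trait.
X = rng.integers(0, 3, size=(n, p)).astype(float)
y = rng.normal(size=n)

# A positive semi-definite matrix (linear kernel), used in both methods.
K = X @ X.T

# Centered similarity matrix for GDBR (assumed Gower-type centering).
C = np.eye(n) - np.ones((n, n)) / n
G = C @ K @ C

# With no covariates, the null fitted mean is the sample mean.
yc = y - y.mean()
ss = yc @ yc

# Projection onto the centered trait and its complement.
H = np.outer(yc, yc) / ss
R = np.eye(n) - H

# GDBR pseudo-F statistic (degrees-of-freedom constants omitted).
T = np.trace(H @ G @ H)
F = T / np.trace(R @ G @ R)

# KMR score statistic, up to a scaling constant (assumed form).
Q = yc @ K @ yc

# The F numerator equals Q divided by the trait's sum of squares, and
# F is an increasing function of Q because tr(G) and ss do not change
# when the trait values are permuted.
print(np.isclose(T, Q / ss))
print(np.isclose(F, T / (np.trace(G) - T)))
```

Because tr(G) and the centered sum of squares are invariant to permuting the trait values, ranking permutations by F or by Q gives the same permutation p-value under these assumptions. For a binary trait with no covariates, the null fitted probability from an intercept-only logistic model is also the sample mean, so the same centered residual vector appears in Q.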