Liu Dawei, Lin Xihong, Ghosh Debashis
Center for Statistical Sciences, Brown University, Providence, Rhode Island 02912, USA.
Biometrics. 2007 Dec;63(4):1079-88. doi: 10.1111/j.1541-0420.2007.00799.x.
We consider a semiparametric regression model that relates a normally distributed outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function, and it allows for the possibility that each gene expression effect might be nonlinear and that the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference can hence proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.
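The mixed model formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' software. It assumes the working model y = X beta + h + e with h ~ N(0, tau K) and e ~ N(0, sigma2 I), a Gaussian kernel with scale parameter rho, and variance components that are already known; the restricted maximum likelihood step that would estimate tau, sigma2 and rho is omitted. The function names, the simulated data and all parameter values are hypothetical.

```python
import numpy as np


def gaussian_kernel(Z, rho):
    """Gaussian kernel matrix K[i, j] = exp(-||z_i - z_j||^2 / rho) (assumed kernel choice)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / rho)


def lskm_blup(y, X, K, tau, sigma2):
    """Estimate beta and predict h for y = X beta + h + e,
    with h ~ N(0, tau K) and e ~ N(0, sigma2 I), variance components fixed."""
    n = len(y)
    V = tau * K + sigma2 * np.eye(n)            # marginal covariance of y
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    # Generalized least squares estimate of the covariate effects.
    beta = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
    # Best linear unbiased predictor of the pathway effect.
    h = tau * K @ np.linalg.solve(V, y - X @ beta)
    return beta, h


# Hypothetical simulated data: 2 covariates (including an intercept) and 5 genes.
rng = np.random.default_rng(0)
n, p = 60, 5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, p))
h_true = np.sin(Z[:, 0]) + Z[:, 1] * Z[:, 2]    # nonlinear effect with an interaction
y = X @ np.array([1.0, 0.5]) + h_true + rng.normal(scale=0.5, size=n)

tau, sigma2, rho = 1.0, 0.25, float(p)           # assumed values, not REML estimates
K = gaussian_kernel(Z, rho)
beta_hat, h_hat = lskm_blup(y, X, K, tau, sigma2)
```

With the variance components held fixed, the predicted pathway effect coincides with a penalized kernel fit whose smoothing parameter is sigma2 / tau, which is the sense in which the smoothing parameter can be treated as a ratio of variance components.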