Department of Biostatistics, Mailman School of Public Health, Columbia University, NY 10032, U.S.A.
Stat Med. 2011 Jul 10;30(15):1883-97. doi: 10.1002/sim.4236. Epub 2011 Apr 13.
Longitudinal data are routinely collected in biomedical research studies. A natural model for such data decomposes an individual's outcome into the sum of a population mean function and random subject-specific deviations. When parametric assumptions are too restrictive, methods that model the population mean function and the random subject-specific functions nonparametrically are needed. In some applications, it is also desirable to estimate the covariance function of the random subject-specific deviations. In this work, flexible yet computationally efficient methods are developed for a general class of semiparametric mixed effects models in which the functional forms of the population mean and the subject-specific curves are unspecified. We estimate the nonparametric components of the model by penalized splines (P-splines, Biometrics 2001; 57:253-259) and reparameterize the random curve covariance function by a modified Cholesky decomposition (Biometrics 2002; 58:121-128), which allows for unconstrained estimation of a positive-semidefinite matrix. To provide smooth estimates, we penalize the roughness of the fitted curves and derive closed-form solutions in the maximization step of an EM algorithm. In addition, we present models and methods for longitudinal family data, where subjects in a family are correlated and the covariance function is decomposed into a subject-level source and an observation-level source. We apply these methods to the multi-level Framingham Heart Study data to estimate the age-specific heritability of systolic blood pressure nonparametrically.
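To illustrate why a modified Cholesky reparameterization permits unconstrained estimation, the following is a minimal sketch, not the authors' implementation. It assumes the common form Sigma = L D L^T, with L unit lower triangular and D diagonal with entries exp(theta). The function name `covariance_from_unconstrained` and the ordering of the parameter vector are hypothetical choices made for this example.

```python
import numpy as np

def covariance_from_unconstrained(theta, q):
    """Map an unconstrained real vector to a q x q covariance matrix.

    Assumed layout of theta (illustrative only): the first q entries are
    log-variances for D, the remaining q*(q-1)/2 entries fill the strictly
    lower triangle of the unit lower-triangular factor L.
    """
    theta = np.asarray(theta, dtype=float)
    n_off = q * (q - 1) // 2
    if theta.shape != (q + n_off,):
        raise ValueError(f"theta must have length {q + n_off}")
    # Exponentiating keeps the diagonal of D strictly positive.
    d = np.exp(theta[:q])
    # Any real values are allowed below the diagonal of L.
    L = np.eye(q)
    L[np.tril_indices(q, k=-1)] = theta[q:]
    # Sigma = L D L^T is symmetric and positive definite by construction.
    return L @ np.diag(d) @ L.T

# Usage: any real vector of the right length yields a valid covariance.
rng = np.random.default_rng(0)
sigma = covariance_from_unconstrained(rng.normal(size=6), q=3)
```

Because every real-valued `theta` maps to a valid covariance matrix, an optimizer can search over `theta` without positivity constraints. With the exponential map the result is strictly positive definite; the semidefinite boundary is approached only as a log-variance tends to minus infinity.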
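The closed-form flavor of a penalized spline fit can be sketched as follows. This is a simplified illustration for a single mean curve, not the EM maximization step derived in the paper. It assumes a truncated-line basis with a ridge-type penalty on the knot coefficients only; the function name `fit_penalized_spline`, the knot placement, and the smoothing parameter `lam` are hypothetical.

```python
import numpy as np

def fit_penalized_spline(x, y, knots, lam):
    """Penalized least-squares fit with a truncated-line spline basis.

    Solves beta = (X^T X + lam * P)^{-1} X^T y, where P penalizes only
    the coefficients of the truncated-line terms (x - knot)_+.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, linear term, one truncated line per knot.
    columns = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    X = np.column_stack(columns)
    # The intercept and slope are left unpenalized.
    P = np.diag([0.0, 0.0] + [1.0] * len(knots))
    beta = np.linalg.solve(X.T @ X + lam * P, X.T @ y)
    return beta, X @ beta

# Usage on simulated data (not from the Framingham Heart Study).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.2, size=x.size)
knots = np.linspace(0.1, 0.9, 5)
beta, fitted = fit_penalized_spline(x, y, knots, lam=1.0)
```

Larger values of `lam` shrink the knot coefficients toward zero and pull the fit toward a straight line, while `lam = 0` recovers the unpenalized regression spline.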