Botts Carsten H, Daniels Michael J
Department of Mathematics and Statistics, Williams College, Williamstown, Massachusetts 01267, USA.
Comput Stat Data Anal. 2008 Aug 15;52(12):5100-5120. doi: 10.1016/j.csda.2008.05.008.
We model sparse functional data from multiple subjects with a mixed-effects regression spline. In this model, the expected value for any subject, conditional on the random effects, is the sum of a population curve and a subject-specific deviation from that curve. The population curve and the subject-specific deviations are both modeled as free-knot B-splines, with k and k' knots located at t_k and t_k', respectively. To identify the number and location of the "free" knots, we sample from the posterior p(k, t_k, k', t_k' | y) using reversible jump MCMC methods. Sampling from this posterior is complicated, however, by the flexibility we allow in the model's covariance structure. No restrictions other than positive definiteness are placed on the covariance parameters ψ and σ², and, as a result, the likelihood p(y | k, t_k, k', t_k') has no analytical form. In this paper, we consider two approximations to p(y | k, t_k, k', t_k') and sample from the corresponding approximations to p(k, t_k, k', t_k' | y). We also sample from p(k, t_k, k', t_k', ψ, σ² | y), whose likelihood is available in closed form. Although sampling from this larger posterior is less efficient, the resulting marginal distribution of the knots is exact, which allows us to evaluate the accuracy of each approximation. We then consider a real data set and explore the difference between p(k, t_k, k', t_k', ψ, σ² | y) and the more accurate of the two approximations to p(k, t_k, k', t_k' | y).
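To make the model structure concrete, the following is a minimal sketch, not the authors' implementation. It builds B-spline design matrices for a given set of interior knots and evaluates a Gaussian log-likelihood for one subject when ψ and σ² are known. Several things here are assumptions that the abstract does not state: cubic splines on the interval [0, 1], a fixed population coefficient vector `beta` (the paper may instead integrate it out under a prior), and the function names `bspline_basis` and `subject_loglik`, which are hypothetical.

```python
import numpy as np


def bspline_basis(x, interior_knots, degree=3, lo=0.0, hi=1.0):
    """B-spline design matrix via the Cox-de Boor recursion.

    Assumes distinct interior knots strictly inside (lo, hi) and x in [lo, hi].
    Returns an array of shape (len(x), len(interior_knots) + degree + 1).
    """
    x = np.asarray(x, dtype=float)
    t = np.concatenate([
        np.full(degree + 1, lo),
        np.sort(np.asarray(interior_knots, dtype=float)),
        np.full(degree + 1, hi),
    ])
    n_basis = len(t) - degree - 1

    # Degree 0: indicator of each half-open knot interval.
    B = np.zeros((x.size, len(t) - 1))
    for j in range(len(t) - 1):
        B[:, j] = (t[j] <= x) & (x < t[j + 1])
    # Put the right endpoint into the last non-empty interval.
    B[x == hi, n_basis - 1] = 1.0

    # Raise the degree one step at a time, skipping zero-width spans.
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, len(t) - 1 - d))
        for j in range(len(t) - 1 - d):
            left = t[j + d] - t[j]
            right = t[j + d + 1] - t[j + 1]
            if left > 0:
                B_new[:, j] += (x - t[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (t[j + d + 1] - x) / right * B[:, j + 1]
        B = B_new
    return B


def subject_loglik(y, x, beta, pop_knots, subj_knots, psi, sigma2):
    """Log-density of one subject's data with the random effects integrated out.

    Assumed model: y = B beta + Z b + e, with b ~ N(0, psi) and e ~ N(0, sigma2 I),
    so that marginally y ~ N(B beta, Z psi Z^T + sigma2 I).
    """
    y = np.asarray(y, dtype=float)
    B = bspline_basis(x, pop_knots)    # population curve basis (k knots)
    Z = bspline_basis(x, subj_knots)   # subject deviation basis (k' knots)
    V = Z @ psi @ Z.T + sigma2 * np.eye(y.size)
    r = y - B @ beta
    _, logdet = np.linalg.slogdet(V)
    quad = r @ np.linalg.solve(V, r)
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + quad)
```

Summing `subject_loglik` over subjects gives a likelihood that is closed form only because ψ and σ² are supplied as arguments. This mirrors the abstract's point that the larger posterior, which includes ψ and σ², is tractable, while the knot-only likelihood requires approximation.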