Yang Xiaowei, Nie Kun
Division of Biostatistics, Department of Public Health Sciences, School of Medicine, University of California, Davis, CA 95616, USA.
Stat Med. 2008 Mar 15;27(6):845-63. doi: 10.1002/sim.2952.
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories are not globally linear or polynomial, which makes it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on linear models. An alternative is to apply the methods of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For data exploration, many functional data analysis strategies have been developed based on various smoothing schemes, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend these hypothesis testing strategies into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since an FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and its computational tools should enhance the capacity of medical statistics for longitudinal data.
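The three-step procedure described in the abstract (transform, fit, test) can be illustrated with a minimal Python sketch of the Fourier branch. It is not the authors' implementation. The function names, the simulated data, and the Monte Carlo p-value are all assumptions made for illustration. The statistic is the usual unnormalized adaptive Neyman form, the maximum over m of the sum of (z_j^2 - 1) for j up to m, divided by sqrt(2m). The null distribution is approximated by simulating independent standard normal z-scores, which ignores the extra variability from estimating the residual variances.

```python
import numpy as np


def fourier_features(Y):
    """Step 1: real-valued Fourier coefficients per subject, low to high frequency."""
    T = Y.shape[1]
    F = np.fft.rfft(Y, axis=1)
    cols = [F[:, 0].real]
    for k in range(1, F.shape[1]):
        cols.append(F[:, k].real)
        # The Nyquist term of an even-length series has no imaginary part.
        if not (T % 2 == 0 and k == T // 2):
            cols.append(F[:, k].imag)
    # Overall scaling is immaterial: each coefficient is standardized in step 2.
    return np.column_stack(cols) / np.sqrt(T)


def coefficient_z_scores(X, Z, k):
    """Step 2: least-squares fit of every Fourier coefficient on X,
    returning standardized estimates for predictor column k."""
    n, p = X.shape
    B, *_ = np.linalg.lstsq(X, Z, rcond=None)
    resid = Z - X @ B
    sigma2 = (resid ** 2).sum(axis=0) / (n - p)  # one variance per frequency
    xtx_inv_kk = np.linalg.inv(X.T @ X)[k, k]
    return B[k] / np.sqrt(sigma2 * xtx_inv_kk)


def adaptive_neyman(z):
    """Step 3: max over m of sum_{j<=m}(z_j^2 - 1) / sqrt(2m)."""
    m = np.arange(1, len(z) + 1)
    return np.max(np.cumsum(z ** 2 - 1.0) / np.sqrt(2.0 * m))


def monte_carlo_p_value(stat, length, n_sim=2000, seed=0):
    """Approximate p-value, assuming i.i.d. standard normal z-scores under the null."""
    rng = np.random.default_rng(seed)
    null = np.array(
        [adaptive_neyman(rng.standard_normal(length)) for _ in range(n_sim)]
    )
    return (1 + np.sum(null >= stat)) / (1 + n_sim)


# Simulated example (not the smoking cessation data): two groups of curves
# that differ by a low-frequency sinusoid.
rng = np.random.default_rng(1)
n, T = 40, 16
group = np.repeat([0.0, 1.0], n // 2)
t = np.arange(T)
Y = rng.standard_normal((n, T)) + np.outer(group, 2 * np.sin(2 * np.pi * t / T))
X = np.column_stack([np.ones(n), group])  # intercept + group indicator

z = coefficient_z_scores(X, fourier_features(Y), k=1)
stat = adaptive_neyman(z)
p = monte_carlo_p_value(stat, len(z))
print(stat, p)
```

Ordering the coefficients from low to high frequency matters here: the cumulative sum lets the statistic pick up a signal concentrated in the first few frequencies without being diluted by the noisy high-frequency terms. A wavelet-based variant would replace step 1 with a discrete wavelet transform and step 3 with a thresholding statistic.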