Xing Haipeng, Ying Zhiliang
Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, NY 11794.
J Am Stat Assoc. 2012 Dec 1;107(500). doi: 10.1080/01621459.2012.712425.
Many longitudinal studies relate an outcome process to a set of possibly time-varying covariates, giving rise to the usual regression models for longitudinal data. When the purpose of the study is to investigate covariate effects when the experimental environment undergoes abrupt changes, or to locate periods with different levels of covariate effects, a simple and easy-to-interpret approach is to introduce change-points in the regression coefficients. In this connection, we propose a semiparametric change-point regression model in which the error process (stochastic component) is nonparametric, the baseline mean function (functional part) is completely unspecified, the observation times are allowed to be subject-specific, and the number, locations, and magnitudes of the change-points are unknown and need to be estimated. We further develop an estimation procedure that combines recent advances in semiparametric analysis based on counting process arguments with multiple change-point inference, and discuss its large-sample properties, including consistency and asymptotic normality, under suitable regularity conditions. Simulation results show that the proposed methods work well under a variety of scenarios. An application to a real data set is also given.
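As a rough illustration of the kind of model described above, one possible way to write a regression with change-points in the coefficients is sketched below. The notation is an assumption introduced here for illustration and is not taken from the abstract: $Y_i(t)$ is the outcome of subject $i$ at time $t$, $X_i(t)$ the covariate vector, $\mu_0(t)$ the unspecified baseline mean function, $\varepsilon_i(t)$ the error process, $K$ the unknown number of change-points, $\tau_k$ their locations, and $T$ the end of the observation window.

```latex
\begin{align}
  Y_i(t) &= \mu_0(t) + \beta(t)^{\top} X_i(t) + \varepsilon_i(t), \\
  \beta(t) &= \sum_{k=0}^{K} \beta_k \,\mathbf{1}\{\tau_k < t \le \tau_{k+1}\},
  \qquad 0 = \tau_0 < \tau_1 < \dots < \tau_K < \tau_{K+1} = T .
\end{align}
```

In this sketch the coefficient vector is piecewise constant, so the magnitude of the $k$-th change is $\beta_k - \beta_{k-1}$. Each subject would be observed only at its own set of times, and $K$, the $\tau_k$, and the $\beta_k$ would all be estimated from the data.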