A semiparametric model for the analysis of recurrent-event panel data.

Balshaw Robert F, Dean C B

Simon Fraser University, Burnaby, British Columbia, Canada.

Biometrics. 2002 Jun;58(2):324-31. doi: 10.1111/j.0006-341x.2002.00324.x.

In many longitudinal studies, interest focuses on the occurrence rate of some phenomenon for the subjects in the study. When the phenomenon is nonterminating and possibly recurring, the result is a recurrent-event data set. Examples include epileptic seizures and recurrent cancers. When the recurring event is detectable only by an expensive or invasive examination, only the number of events occurring between follow-up times may be available. This article presents a semiparametric model for such data, based on a multiplicative intensity model paired with a fully flexible nonparametric baseline intensity function. A random subject-specific effect is included in the intensity model to account for the overdispersion frequently displayed in count data. Estimators are determined from quasi-likelihood estimating functions. Because only first- and second-moment assumptions are required for quasi-likelihood, the method is more robust than those based on the specification of a full parametric likelihood. Consistency of the estimators depends only on the assumption of the proportional intensity model. The semiparametric estimators are shown to be highly efficient compared with the usual parametric estimators. As with semiparametric methods in survival analysis, the method provides useful diagnostics for specific parametric models, including a quasi-score statistic for testing specific baseline intensity functions. The techniques are used to analyze cancer recurrences and a pheromone-based mating disruption experiment in moths. A simulation study confirms that, for many practical situations, the estimators possess appropriate small-sample characteristics.
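As a rough illustration of the data structure described above, the following Python sketch simulates panel counts from a multiplicative intensity model with a subject-specific random effect and then recovers the parameters with simple moment estimators. It is a sketch under stated assumptions, not the authors' quasi-likelihood estimating functions. The gamma distribution for the random effect, the single binary covariate, the common follow-up times for all subjects, and every name in the code (`simulate_panel`, `moment_estimates`, `increments`, `tau`) are assumptions introduced here for illustration. The baseline is left unstructured by giving each follow-up interval its own increment of the cumulative baseline intensity.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_panel(n_subjects, beta, tau, increments, seed=0):
    """Simulate interval counts for each subject (hypothetical setup).

    Given the random effect u, the count in interval j is Poisson with mean
    u * increments[j] * exp(beta * x), where x is a binary covariate.
    ASSUMPTION: u is gamma with mean 1 and variance tau; the abstract itself
    only requires first- and second-moment assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_subjects):
        x = i % 2  # alternate subjects between the two covariate groups
        u = rng.gammavariate(1.0 / tau, tau)  # shape 1/tau, scale tau
        counts = [poisson(rng, u * d * math.exp(beta * x)) for d in increments]
        rows.append((x, counts))
    return rows


def moment_estimates(rows):
    """Simple moment estimators, a stand-in for quasi-likelihood fitting.

    Uses E[N] = mu and Var[N] = mu + tau * mu**2 for a subject's total count,
    with mu proportional to exp(beta * x).
    """
    totals = {0: [], 1: []}
    for x, counts in rows:
        totals[x].append(sum(counts))
    mean0 = statistics.fmean(totals[0])
    mean1 = statistics.fmean(totals[1])
    # Proportional intensity: the ratio of group means estimates exp(beta).
    beta_hat = math.log(mean1 / mean0)
    # Overdispersion: excess of variance over the mean, scaled by mean**2.
    tau_hat = (statistics.variance(totals[0]) - mean0) / mean0 ** 2
    # Unstructured baseline: mean count per interval in the x = 0 group.
    n_intervals = len(rows[0][1])
    n0 = len(totals[0])
    delta_hat = [
        sum(counts[j] for x, counts in rows if x == 0) / n0
        for j in range(n_intervals)
    ]
    return beta_hat, tau_hat, delta_hat


if __name__ == "__main__":
    data = simulate_panel(4000, beta=0.5, tau=0.5, increments=[1.0, 0.5, 1.5])
    print(moment_estimates(data))
```

Only the interval counts are used, never the event times, which mirrors the situation in which events are detectable only at follow-up examinations. A positive `tau_hat` indicates variance in excess of the Poisson mean, the overdispersion that the random subject-specific effect is meant to capture.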
