Touloumi G, Pocock S J, Babiker A G, Darbyshire J H
Medical Statistics Unit, London School of Hygiene and Tropical Medicine, U.K.
Stat Med. 1999 May 30;18(10):1215-33. doi: 10.1002/(sici)1097-0258(19990530)18:10<1215::aid-sim118>3.0.co;2-6.
Many cohort studies and clinical trials involve repeated measurements of disease markers. When the primary interest in such longitudinal studies is to estimate and compare the evolution of a disease marker, one problem is that planned measurements are often not collected because of missed visits, withdrawal or attrition (for example, death). Several methods are available to analyse such data, provided that the data are missing at random. However, serious biases can occur when missingness is informative. In such cases, one needs methods that simultaneously model the observed data and the missingness process. In this paper we consider estimation of the rate of change of a disease marker in longitudinal studies in which some subjects drop out prematurely and informatively due to attrition, while others experience a non-informative drop-out process (end of study, withdrawal). We propose a method which combines a linear random effects model for the underlying pattern of the marker with a log-normal survival model for the informative drop-out process. Joint estimates are obtained through the restricted iterative generalized least squares method and are equivalent to restricted maximum likelihood estimates. A nested EM algorithm is applied to deal with censored survival data. The advantages of this method are: it provides a unified approach to estimating all the model parameters; it can deal effectively with irregular data (that is, data measured at irregular time points), a complicated covariance structure and a complex underlying profile of the response variable; and it does not entail such complex computation as would be required to maximize the joint likelihood. The method is illustrated by modelling CD4 count data from a clinical trial in patients with advanced HIV infection, and its performance is tested in simulation studies.
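The bias that informative drop-out can introduce may be easier to see in a small simulation. The sketch below is not the authors' estimation method (it does not implement restricted iterative generalized least squares or the nested EM algorithm). It only generates data from a linear random effects model whose random slope is correlated with a log-normal drop-out time, then compares a naive pooled least-squares slope on the observed data with the same fit on the complete data. The function name `simulate_slopes` and all parameter values (visit schedule, variances, correlation `rho`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate_slopes(n_subjects=2000, rho=0.8, seed=0):
    """Compare pooled OLS slopes with and without informative drop-out.

    All numeric settings are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    visits = np.arange(0.0, 4.5, 0.5)  # planned visits at 0, 0.5, ..., 4 years
    true_slope = -1.0                  # population mean rate of change

    # Standardized random slope and log drop-out residual with correlation rho:
    # subjects declining faster tend to drop out earlier.
    z_slope = rng.standard_normal(n_subjects)
    z_drop = rho * z_slope + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_subjects)

    intercepts = 10.0 + 2.0 * rng.standard_normal(n_subjects)
    slopes = true_slope + 0.5 * z_slope
    dropout_time = np.exp(np.log(2.0) + 0.5 * z_drop)  # log-normal drop-out time

    # Marker values at every planned visit (the data we would like to have).
    t = np.tile(visits, (n_subjects, 1))
    y = intercepts[:, None] + slopes[:, None] * t + rng.standard_normal(t.shape)

    # A measurement is observed only if the visit precedes drop-out.
    observed = t < dropout_time[:, None]

    full_slope = np.polyfit(t.ravel(), y.ravel(), 1)[0]
    naive_slope = np.polyfit(t[observed], y[observed], 1)[0]
    return true_slope, full_slope, naive_slope


if __name__ == "__main__":
    true_slope, full_slope, naive_slope = simulate_slopes()
    print(f"true mean slope:           {true_slope:.3f}")
    print(f"complete-data OLS slope:   {full_slope:.3f}")
    print(f"observed-data (naive) OLS: {naive_slope:.3f}")
```

Under these assumptions the naive fit understates the rate of decline, because later visits are dominated by subjects with slower decline. Setting `rho=0` makes drop-out non-informative in this sketch, and the two fitted slopes should then agree closely.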