Department of Statistics, Sookmyung Women's University, 52 Hyochangwon-gil, Yongsan-gu, Seoul 140-742, Korea.
Stat Med. 2010 Dec 10;29(28):2956-62. doi: 10.1002/sim.4042.
Interval-censored data commonly arise in studies of diseases that progress without symptoms and therefore require clinical evaluation for detection. Several techniques have been proposed under an independence assumption. However, this assumption is not valid when observations come from clusters. Furthermore, when the cluster size is related to the response variable, commonly used methods can yield biased results. For example, in a study of lymphatic filariasis, a parasitic disease in which worms form several nests in the infected person's lymphatic vessels and reside there until adulthood, the response variable of interest is the nest-extinction time. Because nest extinction is assessed by repeated ultrasound examinations, exact extinction times are not observed. Instead, each observation consists of two examination times: the last examination time with living worms and the first examination time with dead worms. Furthermore, as Williamson et al. (Statistics in Medicine 2008; 27:543-555) pointed out, larger nests tend to have lower clearance rates. This association is referred to as informative cluster size. To analyze the relationship between the number of nests and the interval-censored nest-extinction times, this study proposes a joint model for cluster size and clustered interval-censored failure data. A proportional hazards model with a random effect is applied to the failure times, and a mixed ordinal regression model is applied to the cluster size. The joint model approach addresses both the association among failure times from the same cluster and the dependency of failure times on cluster size. Simulation studies are performed to assess the finite-sample properties of the estimators, and lymphatic filariasis data are analyzed as an illustration.
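The data structure described above can be illustrated with a minimal simulation sketch. It is not the authors' model or code. It assumes, for illustration only, a normal random effect shared by both sub-models, an exponential baseline hazard, a cumulative-logit form for the ordinal cluster-size model, and equally spaced examinations. All function names, parameter names, and default values are hypothetical.

```python
import math
import random


def simulate_cluster(rng, beta=0.5, gamma=-1.0, sigma=1.0, base_rate=0.1,
                     cutpoints=(-0.5, 0.5, 1.5), visit_gap=2.0, n_visits=6):
    """Simulate one cluster (e.g. one patient) with informative cluster size.

    Assumed (hypothetical) setup: a random effect b ~ N(0, sigma^2) enters
    both the cluster-size model and the hazard of every member.
    """
    b = rng.gauss(0.0, sigma)

    # Ordinal model for cluster size: P(N <= k | b) = expit(c_k - gamma * b).
    # With gamma < 0, a larger b makes small clusters more likely.
    u = rng.random()
    size = len(cutpoints) + 1
    for k, c in enumerate(cutpoints, start=1):
        if u <= 1.0 / (1.0 + math.exp(-(c - gamma * b))):
            size = k
            break

    # Proportional hazards with exponential baseline: rate = base * exp(beta*x + b).
    x = rng.choice((0, 1))  # hypothetical binary cluster-level covariate
    rate = base_rate * math.exp(beta * x + b)

    intervals = []
    for _ in range(size):
        t = rng.expovariate(rate)  # latent exact failure time, never observed
        k = math.floor(t / visit_gap)
        if k >= n_visits:
            # Still alive at the final examination: right-censored.
            intervals.append((visit_gap * n_visits, math.inf))
        else:
            # Last examination before failure and first examination after it.
            intervals.append((visit_gap * k, visit_gap * (k + 1)))
    return size, x, intervals


def interval_likelihood(left, right, rate):
    """Likelihood contribution S(left) - S(right) for an exponential survival function."""
    return math.exp(-rate * left) - math.exp(-rate * right)


if __name__ == "__main__":
    rng = random.Random(2010)
    for _ in range(3):
        print(simulate_cluster(rng))
```

With the negative `gamma` chosen here, clusters with a small random effect tend to be larger and to fail more slowly, which mimics the informative cluster size described in the abstract. Each observation enters a likelihood only through its pair of examination times, as in `interval_likelihood`.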