Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, Hong Kong.
Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, Hong Kong.
Biometrics. 2023 Sep;79(3):2010-2022. doi: 10.1111/biom.13795. Epub 2022 Dec 15.
Clustered data frequently arise in biomedical studies, where observations, or subunits, measured within a cluster are associated. The cluster size is said to be informative if the outcome variable is associated with the number of subunits in a cluster. In most existing work, informative cluster size is handled by marginal approaches based on within-cluster resampling or cluster-weighted generalized estimating equations. Although these approaches yield consistent estimation of the marginal models, they do not allow estimation of within-cluster associations and are generally inefficient. In this paper, we propose a semiparametric joint model for clustered interval-censored event time data with informative cluster size. We use a random effect to account for the association among event times of the same cluster as well as the association between event times and the cluster size. For estimation, we propose a sieve maximum likelihood approach and devise a computationally efficient expectation-maximization algorithm for implementation. The estimators are shown to be strongly consistent, with the Euclidean components being asymptotically normal and achieving semiparametric efficiency. Extensive simulation studies are conducted to evaluate the finite-sample performance, efficiency, and robustness of the proposed method. We also illustrate our method via application to a motivating periodontal disease dataset.
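To make the notion of informative cluster size concrete, the following is a minimal simulation sketch, not the authors' model or estimation procedure. It assumes a single normal random effect per cluster that enters both a Poisson model for cluster size and an exponential hazard for the event times, with events observed only between fixed inspection visits. All names and parameter values (`simulate_clusters`, `beta`, `gamma`, `sigma`, `base_rate`, the visit schedule) are hypothetical choices made for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (fine for moderate means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_clusters(n_clusters=500, beta=0.5, gamma=-0.8, sigma=1.0,
                      base_rate=0.2, visit_gap=1.0, n_visits=8, seed=0):
    """Simulate clustered interval-censored data with informative cluster size.

    Assumed (hypothetical) data-generating model:
      b_i ~ Normal(0, sigma^2)                     shared cluster random effect
      size_i = 1 + Poisson(4 * exp(gamma * b_i))   cluster size depends on b_i
      T_ij ~ Exponential(base_rate * exp(beta * x_i + b_i))
    Each T_ij is only known to lie between two consecutive inspection visits.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        b = rng.gauss(0.0, sigma)
        x = 1 if rng.random() < 0.5 else 0          # binary cluster-level covariate
        size = 1 + poisson(rng, 4.0 * math.exp(gamma * b))
        rate = base_rate * math.exp(beta * x + b)
        intervals = []
        for _ in range(size):
            t = rng.expovariate(rate)               # latent event time, never observed exactly
            k = math.floor(t / visit_gap)           # number of visits completed before the event
            if k >= n_visits:
                # No event seen by the last visit: right-censored.
                intervals.append((n_visits * visit_gap, math.inf))
            else:
                intervals.append((k * visit_gap, (k + 1) * visit_gap))
        clusters.append({"x": x, "size": size, "intervals": intervals})
    return clusters


def event_fraction(clusters, keep):
    """Fraction of subunits with an event observed by the last visit, over clusters passing `keep`."""
    total = events = 0
    for c in clusters:
        if keep(c):
            total += c["size"]
            events += sum(1 for _, right in c["intervals"] if math.isfinite(right))
    return events / total


if __name__ == "__main__":
    data = simulate_clusters()
    print("small clusters:", event_fraction(data, lambda c: c["size"] <= 5))
    print("large clusters:", event_fraction(data, lambda c: c["size"] > 5))
```

With a negative `gamma`, clusters with a large random effect tend to be both smaller and at higher risk, so the observed event fraction differs between small and large clusters. This is the dependence between outcome and cluster size that the abstract calls informative, and it is why an analysis that ignores cluster size can be misleading. The sign of `gamma` here is only an assumption chosen to mimic a setting in which more severe disease leaves fewer subunits under observation.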