Department of Mathematics and Department of Surgical Sciences, Uppsala University, Regional Cancer Center Midsweden, Uppsala University Hospital, Uppsala, Sweden.
Stat Methods Med Res. 2023 Apr;32(4):806-819. doi: 10.1177/09622802231155010. Epub 2023 Feb 12.
We consider the analysis of longitudinal data on multiple types of events where some events are observed only at a coarser level (e.g. grouped) at some time points during follow-up. This happens, for example, when certain events, such as disease progression, are observable only during parts of follow-up for some subjects, causing gaps in the data, or when the time of death is observed but the cause of death is unknown. In such cases, data are missing on key characteristics of the event history, such as onset, time in state, and number of events. We derive the likelihood function, score, and observed information under independent and non-informative coarsening. In a simulation study, we compare the bias, empirical standard errors, and confidence interval coverage of estimators based on four approaches: direct maximum likelihood; Monte Carlo Expectation Maximisation; ignoring the coarsening, thereby acting as if no event occurred; and artificial right censoring at the first time of coarsening. Longitudinal data on drug prescriptions and survival in men receiving palliative treatment for prostate cancer are used to estimate the parameters of one of the data-generating models. We demonstrate that the performance of the estimators depends on several factors, including sample size and type of coarsening.
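As a rough illustration of what a likelihood under coarsening can look like, the sketch below treats the simplest case mentioned above: the time of death is observed but the cause is sometimes unknown. It is a toy model and not the model used in the paper. It assumes two competing causes with constant hazards `lam1` and `lam2`, and independent, non-informative coarsening. The status labels and the function names `log_likelihood`, `direct_mle` and `naive_mle` are hypothetical. A death of unknown cause contributes the total hazard `lam1 + lam2` to the likelihood, while the naive estimator treats such a death as a censoring, which is only one possible reading of "ignoring the coarsening".

```python
import math


def log_likelihood(lam1, lam2, records):
    """Log-likelihood for two competing causes with constant hazards.

    records: list of (time, status) pairs, where status is one of
    "cause1", "cause2", "unknown" (death, cause not observed) or "censored".
    Assumes independent, non-informative coarsening (toy model).
    """
    total = lam1 + lam2
    ll = 0.0
    for time, status in records:
        ll -= total * time  # surviving both causes up to `time`
        if status == "cause1":
            ll += math.log(lam1)
        elif status == "cause2":
            ll += math.log(lam2)
        elif status == "unknown":
            ll += math.log(total)  # coarsened event: either cause
        # "censored" contributes only the survival term
    return ll


def _counts(records):
    """Total follow-up time and event counts by status."""
    total_time = sum(time for time, _ in records)
    d1 = sum(1 for _, s in records if s == "cause1")
    d2 = sum(1 for _, s in records if s == "cause2")
    du = sum(1 for _, s in records if s == "unknown")
    return total_time, d1, d2, du


def direct_mle(records):
    """Closed-form maximiser of log_likelihood in this toy model."""
    total_time, d1, d2, du = _counts(records)
    if d1 + d2 == 0:
        raise ValueError("need at least one death with known cause")
    total_hazard = (d1 + d2 + du) / total_time  # all deaths count
    share1 = d1 / (d1 + d2)  # cause split from known causes only
    return total_hazard * share1, total_hazard * (1 - share1)


def naive_mle(records):
    """Treats unknown-cause deaths as censored (an assumed 'ignore' rule)."""
    total_time, d1, d2, _ = _counts(records)
    return d1 / total_time, d2 / total_time


records = [
    (2.0, "cause1"), (1.0, "cause2"), (3.0, "unknown"),
    (4.0, "censored"), (2.0, "cause1"), (3.0, "unknown"),
]
print("direct:", direct_mle(records))
print("naive: ", naive_mle(records))
```

With these made-up records, the naive hazard estimates are smaller than the direct maximum likelihood estimates, because the two deaths of unknown cause are not counted as events. In models without a closed-form maximiser, the same log-likelihood would have to be maximised numerically.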