Bible Joe, St Ville Madeleine, Albert Paul S, Liu Danping
School of Mathematical and Statistical Sciences, Clemson University, Clemson, SC, USA.
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA.
Stat Methods Med Res. 2024 Feb;33(2):243-255. doi: 10.1177/09622802231225527. Epub 2024 Feb 1.
When extracting medical record data to form a retrospective cohort, investigators typically focus on a pre-specified study window and select subjects who had hospital visits during that window. However, such data extraction may suffer from an informative observation process, since sicker patients may visit the hospital more frequently. For example, the Consecutive Pregnancy Study is a retrospective cohort study of women with multiple pregnancies in 23 Utah hospitals from 2003 to 2010, in which the interest is in understanding the risk factors for recurrent pregnancy outcomes, such as preterm birth. The observation process is informative in the sense that women with adverse pregnancy outcomes may be less likely, willing, or able to endure subsequent pregnancies. We propose a three-part joint model with a shared random effects structure to address this analytic complication. Specifically, a first-order transition model is used for the longitudinal binary outcome; a gamma regression model is assumed for the inter-pregnancy intervals; and a continuation ratio model specifies the probability of continuing with more births in the future. We note that the latter two parts give rise to a parametric cure-rate survival model. The performance of the proposed method was examined in extensive simulation studies, under both correctly specified and misspecified models. The analyses of the Consecutive Pregnancy Study data further demonstrate the inadequacy of fitting the transition model alone while ignoring the informative observation process.
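To make the three-part structure concrete, the following is a minimal simulation sketch of one woman's pregnancy history. It is not the authors' implementation. The function name `simulate_woman`, all coefficient values, the normal distribution of the shared random effect, the way that effect enters each part, and the cap on the number of births are assumptions chosen only for illustration; the abstract specifies none of them. The study window is not modeled here.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_woman(rng, max_births=6, sigma_b=1.0):
    """Simulate one pregnancy history under a hypothetical three-part model.

    All numeric coefficients below are illustrative assumptions.
    """
    # Shared random effect linking the three parts (assumed normal).
    b = rng.gauss(0.0, sigma_b)
    history = []
    prev_y = 0  # assume no prior preterm birth before the first pregnancy
    for k in range(1, max_births + 1):
        # Part 1: first-order transition model for the binary outcome,
        # depending on the previous outcome and the random effect.
        p_preterm = expit(-2.0 + 1.5 * prev_y + 1.0 * b)
        y = 1 if rng.random() < p_preterm else 0
        record = {"pregnancy": k, "preterm": y, "interval_years": None}
        history.append(record)

        # Part 3: continuation ratio model for having a further birth.
        p_continue = expit(0.5 - 0.8 * y - 0.5 * b)
        if k == max_births or rng.random() >= p_continue:
            break

        # Part 2: gamma regression (log link on the mean) for the
        # interval from this pregnancy to the next one.
        mean_gap = math.exp(0.8 + 0.3 * y + 0.2 * b)
        shape = 2.0
        record["interval_years"] = rng.gammavariate(shape, mean_gap / shape)
        prev_y = y
    return history


if __name__ == "__main__":
    rng = random.Random(2024)
    for row in simulate_woman(rng):
        print(row)
```

In this sketch a woman who stops contributes no further interval, which corresponds loosely to the "cured" fraction in a cure-rate survival model, while the gamma part describes the interval among those who continue. Because the same random effect `b` appears in all three parts, the number and timing of observed pregnancies carry information about the outcome, which is the sense in which the observation process is informative.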