Luo Yu, Stephens David A, Verma Aman, Buckeridge David L
Department of Mathematics and Statistics, McGill University, Quebec, Canada.
Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Quebec, Canada.
Biometrics. 2021 Mar;77(1):78-90. doi: 10.1111/biom.13261. Epub 2020 Mar 31.
Large volumes of longitudinal health records are now available for dynamic monitoring of the underlying processes governing the observations. However, the progression of health status over time is typically not observed directly: records are generated only when a subject interacts with the system, yielding irregular and often sparse observations. This suggests that the observed trajectories should be modeled via a latent continuous-time process, potentially as a function of time-varying covariates. We develop a continuous-time hidden Markov model to analyze longitudinal data, accounting for irregular visits and different types of observations. By employing a specific missing data likelihood formulation, we can construct an efficient computational algorithm. We focus on Bayesian inference for the model, which is facilitated by an expectation-maximization algorithm and Markov chain Monte Carlo methods. Simulation studies demonstrate that these approaches can be implemented efficiently for large data sets in a fully Bayesian setting. We apply this model to a real cohort of patients with chronic obstructive pulmonary disease, where the outcome is the number of drugs taken and the covariates are health care utilization indicators and patient characteristics.
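To make the role of irregular visit times concrete, the following is a minimal sketch of how a continuous-time hidden Markov model likelihood can be evaluated for one subject. It is not the authors' algorithm: it assumes only two latent states, a Poisson-distributed count outcome, and no covariates, and it uses the standard scaled forward recursion rather than the missing data formulation mentioned above. The names `transition_matrix`, `poisson_pmf`, and `log_likelihood`, and all parameter values, are hypothetical and chosen for illustration only.

```python
import math


def transition_matrix(a, b, dt):
    """Closed-form P(dt) = exp(Q * dt) for the 2-state generator
    Q = [[-a, a], [b, -b]] (assumed rates: a for 0 -> 1, b for 1 -> 0)."""
    s = a + b
    e = math.exp(-s * dt)
    return [
        [(b + a * e) / s, (a - a * e) / s],
        [(b - b * e) / s, (a + b * e) / s],
    ]


def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam (assumed emission model)."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def log_likelihood(times, counts, a, b, lams, init):
    """Scaled forward recursion over irregularly spaced visit times.

    times  : increasing visit times for one subject
    counts : observed count at each visit
    lams   : Poisson mean for each latent state
    init   : initial latent state distribution at the first visit
    """
    alpha = [init[k] * poisson_pmf(counts[0], lams[k]) for k in range(2)]
    c = sum(alpha)
    loglik = math.log(c)
    alpha = [x / c for x in alpha]
    for i in range(1, len(times)):
        # The gap between visits enters only through P(dt).
        P = transition_matrix(a, b, times[i] - times[i - 1])
        alpha = [
            sum(alpha[j] * P[j][k] for j in range(2)) * poisson_pmf(counts[i], lams[k])
            for k in range(2)
        ]
        c = sum(alpha)
        loglik += math.log(c)
        alpha = [x / c for x in alpha]
    return loglik


# Illustrative call with made-up visit times and counts.
example = log_likelihood(
    times=[0.0, 0.4, 3.1, 3.5, 9.0],
    counts=[1, 2, 5, 4, 0],
    a=0.3, b=0.7, lams=[1.0, 4.0], init=[0.5, 0.5],
)
```

Because each step uses the transition matrix for the actual elapsed time between visits, unequal gaps need no imputation or discretization. With more than two states the closed form is no longer available, and a matrix exponential routine would be needed in its place.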