Kendall Emmett B, Williams Jonathan P, Hermansen Gudmund H, Bois Frederic, Thanh Vo Hong
Department of Statistics, North Carolina State University.
Centre for Advanced Study, Norwegian Academy of Science and Letters.
J Comput Graph Stat. 2025;34(2):668-682. doi: 10.1080/10618600.2024.2388609. Epub 2024 Sep 30.
Multistate Markov models are a canonical parametric approach for modeling observed or latent stochastic processes supported on a finite state space. Continuous-time Markov processes describe data that are observed irregularly over time, as is often the case in longitudinal medical data, for example. Assuming that a continuous-time Markov process is time-homogeneous, a closed-form likelihood function can be derived from the Kolmogorov forward equations, a system of differential equations with a well-known matrix-exponential solution. However, the forward equations do not admit an analytical solution for continuous-time, time-inhomogeneous Markov processes, and so researchers and practitioners often make the simplifying assumption that the process is piecewise time-homogeneous. In this paper, we provide intuitions and illustrations of the potential biases for parameter estimation that may ensue in the more realistic scenario that the piecewise-homogeneous assumption is violated, and we advocate for a solution for likelihood computation in a truly time-inhomogeneous fashion. Particular focus is afforded to the context of multistate Markov models that allow for state label misclassifications, which applies more broadly to hidden Markov models (HMMs), and Bayesian computations bypass the necessity for computationally demanding numerical gradient approximations for obtaining maximum likelihood estimates (MLEs). Supplemental materials are available online.
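As a minimal sketch of the contrast drawn in the abstract, the Python snippet below computes a transition probability matrix in two ways: with the matrix exponential, which is valid when the generator is constant, and by numerically integrating the Kolmogorov forward equations dP(s,u)/du = P(s,u)Q(u) with P(s,s) = I when the generator varies with time. The function names, the three-state illness-death structure, and the log-linear rate coefficients are hypothetical choices made for illustration; they are not taken from the paper, and the ODE solver used here is not necessarily the one the authors advocate.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm


def transition_matrix_homogeneous(Q, dt):
    """Closed-form solution P(dt) = expm(Q * dt) for a constant generator Q."""
    return expm(np.asarray(Q, dtype=float) * dt)


def transition_matrix_inhomogeneous(Q_of_t, s, t, rtol=1e-8, atol=1e-10):
    """Integrate dP(s, u)/du = P(s, u) Q(u) from u = s to u = t, with P(s, s) = I."""
    n = Q_of_t(s).shape[0]

    def rhs(u, p_flat):
        # The solver works on vectors, so P is flattened and reshaped each step.
        return (p_flat.reshape(n, n) @ Q_of_t(u)).ravel()

    sol = solve_ivp(rhs, (s, t), np.eye(n).ravel(), rtol=rtol, atol=atol)
    return sol.y[:, -1].reshape(n, n)


def example_generator(t):
    """Hypothetical 3-state illness-death generator with log-linear rates in t."""
    q12 = np.exp(-1.0 + 0.5 * t)  # state 1 -> state 2
    q13 = np.exp(-2.0 + 0.3 * t)  # state 1 -> state 3 (absorbing)
    q23 = np.exp(-1.5 + 0.4 * t)  # state 2 -> state 3 (absorbing)
    return np.array([
        [-(q12 + q13), q12, q13],
        [0.0, -q23, q23],
        [0.0, 0.0, 0.0],
    ])


s, t = 0.0, 2.0
# Time-inhomogeneous transition probabilities from the forward equations.
P_ode = transition_matrix_inhomogeneous(example_generator, s, t)
# Approximation that freezes the generator at its value at the interval start.
P_frozen = transition_matrix_homogeneous(example_generator(s), t - s)

print(np.round(P_ode, 4))
print(np.round(P_frozen, 4))
print("max abs difference:", np.abs(P_ode - P_frozen).max())
```

In this toy setting, freezing the generator over a single long interval stands in for a coarse piecewise-homogeneous approximation; the gap between the two matrices shrinks as the interval is divided into shorter pieces, which is one way to see how the choice of pieces can feed into the likelihood.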