Simons Jan-Willem, Boverhof Bart-Jan, Aarts Emmeke
Department of Sociology, Utrecht University, Utrecht, The Netherlands.
Erasmus School of Health Policy & Management, Erasmus University Rotterdam, Rotterdam, The Netherlands.
PLoS One. 2024 Dec 11;19(12):e0314444. doi: 10.1371/journal.pone.0314444. eCollection 2024.
The hidden Markov model is a popular modeling strategy for describing and explaining latent process dynamics. Little is known about the estimation performance of the Bayesian hidden Markov model when applied to categorical, one-level data. We conducted a simulation study to assess the effect of (1) the number of observations (250 to 8,000), (2) the number of levels in the categorical outcome variable (3 to 7), and (3) state distinctiveness and state separation in the emission distribution (low, medium, high) on the performance of the Bayesian hidden Markov model. Performance is quantified in terms of convergence, accuracy, precision, and coverage. Convergence is generally achieved across conditions. Accuracy, precision, and coverage increase with a higher number of observations and a higher level of state distinctiveness, and to a lesser extent with a higher level of state separation. The number of categorical levels only marginally influences performance. A minimum of 1,000 observations is recommended to ensure adequate model performance.
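To make the simulation design concrete, the following minimal Python sketch generates one sequence of categorical observations from a hidden Markov model. It is only an illustration of the kind of data such a study works with, not the authors' code: the function name `simulate_categorical_hmm`, the choice of three hidden states, and all probability values are assumptions made for this example, and the sketch covers data generation only, not Bayesian estimation.

```python
import numpy as np


def simulate_categorical_hmm(n_obs, transition, emission, initial=None, seed=0):
    """Draw hidden states and categorical observations from an HMM.

    transition: (n_states, n_states) matrix, rows sum to 1.
    emission:   (n_states, n_categories) matrix, rows sum to 1.
    initial:    initial state distribution (uniform if omitted).
    """
    rng = np.random.default_rng(seed)
    transition = np.asarray(transition, dtype=float)
    emission = np.asarray(emission, dtype=float)
    n_states, n_categories = emission.shape
    if initial is None:
        initial = np.full(n_states, 1.0 / n_states)

    states = np.empty(n_obs, dtype=int)
    observations = np.empty(n_obs, dtype=int)

    # First time point: draw the state from the initial distribution.
    states[0] = rng.choice(n_states, p=initial)
    observations[0] = rng.choice(n_categories, p=emission[states[0]])

    # Later time points: the state depends only on the previous state,
    # and the observation depends only on the current state.
    for t in range(1, n_obs):
        states[t] = rng.choice(n_states, p=transition[states[t - 1]])
        observations[t] = rng.choice(n_categories, p=emission[states[t]])

    return states, observations


# Illustrative values only (assumed, not taken from the study):
# 3 hidden states, 3 outcome categories, 1,000 observations.
transition = [[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]]
emission = [[0.8, 0.1, 0.1],   # each state strongly prefers one category
            [0.1, 0.8, 0.1],
            [0.1, 0.1, 0.8]]
states, observations = simulate_categorical_hmm(1000, transition, emission)
```

In this sketch, flattening the rows of `emission` toward a uniform distribution would correspond loosely to making states harder to tell apart, and changing `n_obs` or the number of emission columns would vary the other two design factors.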