Mannan Haider R, Koval John J
Health Indicators, Canadian Institute for Health Information, Toronto, Ontario, Canada.
Stat Methods Med Res. 2003 Mar;12(2):125-46. doi: 10.1191/0962280203sm323ra.
Measures and reports of smoking behaviour are known to be subject to substantial measurement error. The manifest Markov model, which does not account for measurement error in observed responses, may therefore be inadequate for modelling changes in adolescent smoking behaviour over time. To address this, we fit several Mixed Markov Latent Class (MMLC) models to data sets from two longitudinal panel studies, the third Waterloo Smoking Prevention study and the UWO smoking study, which have different numbers of measurements of adolescent smoking behaviour. However, the conventional statistics used to test the goodness of fit of these models do not follow the theoretical chi-square distribution when the data are sparse, and the two data sets analysed had varying degrees of sparsity. This problem can be addressed by estimating the proper distribution of the fit measures using Monte Carlo bootstrap simulation. In this study, we showed that incorporating response uncertainty in smoking behaviour significantly improved the fit of a single Markov chain model. However, the single-chain latent Markov model did not adequately fit either data set, indicating that the smoking process was heterogeneous with regard to latent Markov chains. A higher percentage of students (except for never smokers) changed their smoking behaviour over time at the manifest level than at the latent, or true, level. The smoking process generally accelerated with time. Students tended to underreport their smoking behaviour, and response uncertainty was estimated to be considerably lower for the Waterloo study, which adopted the 'bogus pipeline' method for reducing measurement error, than for the UWO study, which did not. For the two-chain latent mixed Markov models, adding a 'stayer' chain to an unrestricted Markov chain led to a significant improvement in model fit for the UWO study only. For both data sets, assuming the existence of an independence chain did not significantly improve model fit. The unrestricted two-chain latent mixed Markov model fit significantly better than a simple latent Markov model, but it was overparameterized when the latent transition probabilities and/or response probabilities were assumed nonstationary. For the other models, the manifest/latent transition probabilities and the response probabilities (for the four-wave Waterloo study only) were found to be nonstationary for both data sets.
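The Monte Carlo bootstrap idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure or software. It assumes a simple row/column independence model as a stand-in for an MMLC model, and the function names (`pearson_x2`, `fit_independence`, `bootstrap_p_value`) and the example table are hypothetical. The observed fit statistic is compared with its simulated distribution under the fitted model, rather than with the theoretical chi-square distribution, which may be unreliable for sparse tables.

```python
import numpy as np

def pearson_x2(counts, probs):
    """Pearson chi-square of observed counts against fitted cell probabilities."""
    expected = counts.sum() * probs
    mask = expected > 0  # cells with zero expectation contribute nothing
    return float(((counts[mask] - expected[mask]) ** 2 / expected[mask]).sum())

def fit_independence(table):
    """Stand-in model (an assumption): cell probabilities under independence."""
    n = table.sum()
    return np.outer(table.sum(axis=1) / n, table.sum(axis=0) / n)

def bootstrap_p_value(table, fit, n_boot=500, seed=0):
    """Parametric bootstrap: simulate from the fitted model, refit, recompute."""
    rng = np.random.default_rng(seed)
    n = int(table.sum())
    p_hat = fit(table)
    observed = pearson_x2(table, p_hat)
    exceed = 0
    for _ in range(n_boot):
        # Draw a table of the same size from the fitted cell probabilities.
        sim = rng.multinomial(n, p_hat.ravel()).reshape(table.shape)
        # Refit on the simulated table so the statistic mirrors the original fit.
        if pearson_x2(sim, fit(sim)) >= observed:
            exceed += 1
    return observed, exceed / n_boot

# Hypothetical sparse 3x3 table, for illustration only (not study data).
table = np.array([[12, 3, 1], [4, 9, 2], [0, 2, 7]])
x2, p_value = bootstrap_p_value(table, fit_independence, n_boot=200)
```

Refitting the model on every simulated table is the costly step. For a latent class model it would mean rerunning the estimation algorithm for each replicate.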