Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany; Department for Psychotherapy and Biopsychosocial Health, Danube University Krems, Dr.-Karl-Dorrek-Straße 30, 3500 Krems, Austria.
Department of Psychosomatic Medicine and Psychotherapy, University Medical Center Hamburg-Eppendorf & Schön Klinik Hamburg Eilbek, Martinistraße 52, 20246 Hamburg & Dehnhaide 120, 22081 Hamburg, Germany.
J Clin Epidemiol. 2016 Mar;71:25-34. doi: 10.1016/j.jclinepi.2015.10.006. Epub 2015 Oct 22.
To investigate the validity of a common depression metric in independent samples.
We applied a common metrics approach for measuring depression, based on item-response theory, to four German-speaking samples that had completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric with reestimated item parameters derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with a shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models.
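As a rough illustration of the scoring step described above, the following sketch computes category probabilities under a generalized partial credit model and an expected a posteriori (EAP) estimate of the latent score on a grid. The item parameters in `ITEMS`, the function names, and the grid settings are hypothetical and chosen only for illustration; they are not the published common metric parameters, and the estimator used in the study may differ. The `prior_mean` argument only hints at how a shifted prior could enter such a computation.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities for one item under the generalized partial credit model.

    a: discrimination; thresholds: step parameters b_1..b_m.
    Returns a list of probabilities for categories 0..m.
    """
    # Cumulative sums of a * (theta - b_v); category 0 has an empty sum (0).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    mx = max(logits)
    expv = [math.exp(z - mx) for z in logits]
    total = sum(expv)
    return [e / total for e in expv]

def eap_score(responses, items, prior_mean=0.0, prior_sd=1.0, n_points=81):
    """EAP estimate of theta on an evenly spaced grid with a normal prior."""
    grid = [prior_mean + prior_sd * (-4 + 8 * i / (n_points - 1))
            for i in range(n_points)]
    weights = []
    for theta in grid:
        # Unnormalized prior density times the likelihood of the response pattern.
        w = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        for resp, (a, bs) in zip(responses, items):
            w *= gpcm_probs(theta, a, bs)[resp]
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

# Hypothetical parameters for three four-category items (NOT the published values).
ITEMS = [(1.5, [0.0, 1.0, 2.0]), (1.2, [0.5, 1.5, 2.5]), (2.0, [-0.5, 0.8, 1.9])]

low = eap_score([0, 0, 0], ITEMS)
high = eap_score([3, 3, 3], ITEMS)
```

Under these assumptions, a response pattern with higher category choices yields a higher latent score, and supplying fixed item parameters (rather than reestimating them) is what allows scores from different samples to be placed on one scale.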
Item parameters differed across samples and estimation methods. Although differences in latent depression scores between estimation methods were statistically significant, they were clinically irrelevant.
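The agreement between two sets of latent score estimates can be summarized with the quantities behind a Bland-Altman plot: the mean difference (bias) and the 95% limits of agreement. The sketch below is a minimal version using made-up scores; the function name and the example values are assumptions for illustration and do not reproduce the study's data or results.

```python
import statistics

def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) for paired score estimates.

    The limits of agreement are bias +/- 1.96 times the sample standard
    deviation of the paired differences.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up paired scores from two hypothetical estimation methods.
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [0.0, 2.0, 2.0, 4.0]
bias, lower, upper = bland_altman(method_a, method_b)
```

Whether a given bias or interval width counts as clinically irrelevant is a substantive judgment that the numbers alone do not settle.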
Our findings provide evidence that latent depression scores can be estimated by using the item parameters from a common metric instead of reestimating and linking a model. Using common metric parameters is simple, for example, through a Web application (http://www.common-metrics.org), and offers a long-term perspective for improving the comparability of patient-reported outcome measures.