RTI International, Research Triangle Park, North Carolina, USA.
Center for Alcohol Studies, Rutgers University-Piscataway, Piscataway, New Jersey, USA.
J Trauma Stress. 2022 Jun;35(3):926-940. doi: 10.1002/jts.22800. Epub 2022 Feb 5.
Multiple factor analytic and item response theory studies have shown that items/symptoms vary in their relative clinical weights in structured interview measures for posttraumatic stress disorder (PTSD). Despite these findings, the use of total scores, which treat symptoms as though they are equally weighted, predominates in practice, with the consequence of undermining the precision of clinical decision-making. We conducted an integrative data analysis (IDA) study to harmonize PTSD structured interview data (i.e., recoding of items to a common symptom metric) from 25 studies (total N = 2,568). We aimed to identify (a) measurement noninvariance/differential item functioning (MNI/DIF) across multiple populations, psychiatric comorbidities, and interview measures simultaneously and (b) differences in inferences regarding underlying PTSD severity between scale scores estimated using moderated nonlinear factor analysis (MNLFA) and a total score analog model (TSA). Several predictors of MNI/DIF impacted effect size differences in underlying severity across scale scoring methods. Notably, we observed MNI/DIF substantial enough to bias inferences on underlying PTSD severity for two groups: African Americans and incarcerated women. The findings highlight two issues raised elsewhere in the PTSD psychometrics literature: (a) bias in characterizing underlying PTSD severity and individual-level treatment outcomes when the psychometric model underlying total scores fails to fit the data and (b) higher latent severity scores, on average, when using DSM-5 (net of MNI/DIF) criteria, by which multiple factors (e.g., Criterion A discordance across DSM editions, changes to the number/type of symptom clusters, changes to the symptoms themselves) may have impacted severity scoring for some patients.