Department of Cognitive Sciences, University of California, Irvine.
Department of Language Science, University of California, Irvine.
J Speech Lang Hear Res. 2022 Jan 12;65(1):215-237. doi: 10.1044/2021_JSLHR-20-00205. Epub 2021 Nov 24.
Meaningful changes in picture naming responses may be obscured when measuring accuracy instead of quality. A statistic that incorporates information about the severity and nature of impairments may be more sensitive to the effects of treatment.
We analyzed data from repeated administrations of a naming test to 72 participants with stroke aphasia in a clinical trial for anomia therapy. Participants were divided into two groups for analysis to demonstrate replicability. We assessed reliability among response type scores from five raters. We then derived four summary statistics of naming ability and their changes over time for each participant: (a) the standard accuracy measure, (b) an accuracy measure adjusted for item difficulty, (c) an accuracy measure adjusted for item difficulty for specific response types, and (d) a distance measure adjusted for item difficulty for specific response types. While accuracy measures address the likelihood of a correct response, the distance measure reflects that different response types range in their similarity to the target. Model fit was assessed. The frequency of significant improvements and the average magnitude of improvements for each summary statistic were compared between treatment groups and a control group. Effect sizes for each model-based statistic were compared with the effect size for the standard accuracy measure.
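The contrast between an accuracy measure and a distance measure can be illustrated with a minimal sketch. The response-type labels and similarity weights below are hypothetical values chosen for illustration, not taken from the study, and the sketch omits the item-difficulty adjustment and model fitting that the study's statistics (b) through (d) involve.

```python
# Hypothetical similarity-to-target weights per response type (illustrative only;
# these values are assumptions, not estimates from the study).
# 1.0 = correct response, 0.0 = no relation to the target.
SIMILARITY = {
    "correct": 1.0,
    "semantic": 0.6,
    "formal": 0.5,
    "nonword": 0.3,
    "unrelated": 0.1,
    "no_response": 0.0,
}


def accuracy(responses):
    """Proportion of responses coded as correct."""
    return sum(r == "correct" for r in responses) / len(responses)


def mean_distance(responses, similarity=SIMILARITY):
    """Average distance from the target, where distance = 1 - similarity."""
    return sum(1.0 - similarity[r] for r in responses) / len(responses)


# Made-up response codes for one participant on the same four items.
pre = ["no_response", "unrelated", "correct", "no_response"]
post = ["semantic", "semantic", "correct", "formal"]

print(accuracy(pre), accuracy(post))            # accuracy is unchanged
print(mean_distance(pre), mean_distance(post))  # distance to target shrinks
```

In this made-up example the number of correct responses is the same before and after, so accuracy does not move, while the errors shift toward response types that are closer to the target and the distance score improves.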
Interrater and intrarater reliability were near perfect, on average, though somewhat reduced for phonological-level errors. The effects of treatment were more evident, in both frequency and magnitude, with the distance measure than with any of the three accuracy statistics.
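The abstract does not say which agreement statistic was used. As one common option for categorical response-type codes, the sketch below computes Cohen's kappa for a pair of raters; the choice of kappa and the example codes are assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal code frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1.0 - expected)


# Made-up codes: the raters disagree only on one phonological-level error.
rater_1 = ["correct", "correct", "semantic", "nonword", "correct", "formal"]
rater_2 = ["correct", "correct", "semantic", "formal", "correct", "formal"]

kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2))
```

With five raters, the same function could be applied to each pair of raters for interrater agreement, or to one rater's repeated codings for intrarater agreement.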
Consideration of item difficulty and response types revealed additional effects of treatment on naming scores beyond those observed for the standard accuracy measure. The results support theories that assume naming ability is decomposable into subabilities rather than being monolithic, suggesting new opportunities for measuring treatment outcomes.