Hula William, Doyle Patrick J, McNeil Malcolm R, Mikolic Joseph M
VA Pittsburgh Healthcare System and University of Pittsburgh, Pittsburgh, PA 15206, USA.
J Speech Lang Hear Res. 2006 Feb;49(1):27-46. doi: 10.1044/1092-4388(2006/003).
The purpose of this research was to examine the validity of the 55-item Revised Token Test (RTT) and to compare traditional and Rasch-based scores in their ability to detect group differences and change over time. The 55-item RTT was administered to 108 left- and right-hemisphere stroke survivors, and the data were submitted to Rasch analysis. Traditional and Rasch-based scores for a subsample of 60 stroke survivors were submitted to analyses of variance with group (left hemisphere with aphasia vs. right hemisphere) and time post onset (3 vs. 6 months post onset) as factors. The 2 scoring methods were compared using an index of relative precision. Forty-eight items demonstrated acceptable model fit. Misfitting items came primarily from Subtest IX. The Rasch model accounted for 71% of the variance in the responses to the remaining items. Intersubtest patterns of item difficulty were well predicted by item content, but unexpected within-subtest differences were found. Both traditional and Rasch person scores demonstrated significant group differences, but only the latter demonstrated statistically significant change over time. Analysis of relative precision, however, failed to confirm a significant difference between the 2 methods. The findings generally support the RTT's validity, but a minority of items appears to respond to a different construct. Also, within-subtest differences in item difficulty suggest the need for further examination of variability in impaired language performance. Finally, the results suggest an equivocal advantage for Rasch scores in detecting change over time.
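The abstract does not say how the index of relative precision was computed. A common definition is the ratio of the F statistics that two scoring methods yield for the same comparison, and the sketch below assumes that definition. It uses a simplified one-way design rather than the group-by-time analysis described above; the function names are hypothetical and the scores in the usage note are toy numbers, not study data.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of score lists (one per group)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)          # number of groups
    n = len(all_scores)      # total number of observations
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_precision(groups_method_a, groups_method_b):
    """Assumed definition: F for method A divided by F for method B.

    A value above 1 suggests method A separates the groups more sharply.
    """
    return f_statistic(groups_method_a) / f_statistic(groups_method_b)


# Toy scores only (hypothetical, not from the study): two groups scored
# under two hypothetical scoring methods.
method_a = [[1, 2, 3], [4, 5, 6]]
method_b = [[1, 2, 3], [2, 3, 4]]
rp = relative_precision(method_a, method_b)
```

A ratio near 1 would correspond to the equivocal result reported above: neither method is clearly more precise. Judging whether a ratio differs significantly from 1 requires a further step, such as a bootstrap confidence interval, which is not shown here.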