Smith R M, Gross L J
Rehabilitation Foundation Inc., Wheaton, IL 60189, USA.
J Outcome Meas. 1997;1(2):164-72.
Cut scores set by judged item review methods often cannot be validated, because many high-stakes testing programs limit the number of common items across consecutively administered forms. Over time, however, a stable item pool provides secondary links through other test administrations, and these links allow common-item equating to be used to test the stability of the judged cut scores. In this study, five forms of a basic science examination, administered over a three-year period in a national board testing program, were analyzed to determine the stability of the judged cut scores. Stability was assessed by comparing the judged cut scores with the equated cut scores derived by the Rasch common-item equating technique. The results indicate that the cut scores derived from the modified Nedelsky procedure were within equating error of the Rasch equated cut scores over the five administrations.
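The comparison described in the abstract can be illustrated with a minimal sketch. It assumes a simple mean-shift form of Rasch common-item equating and a two-standard-error tolerance. Neither detail is given in the abstract, so both are assumptions, as are the function names and all item difficulties and cut scores in the usage note (hypothetical values in logits).

```python
from math import sqrt
from statistics import mean, stdev


def link_constant(base_difficulties, new_difficulties):
    """Estimate the shift that places new-form measures on the base-form scale.

    Both lists hold Rasch difficulties (logits) for the same common items,
    in the same order. Returns (shift, standard_error_of_shift).
    Assumption: a mean-shift link; the study's exact procedure is not stated.
    """
    if len(base_difficulties) != len(new_difficulties):
        raise ValueError("common-item lists must have the same length")
    if len(base_difficulties) < 2:
        raise ValueError("at least two common items are needed")
    diffs = [b - n for b, n in zip(base_difficulties, new_difficulties)]
    shift = mean(diffs)
    # Standard error of the mean difference across the common items.
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return shift, standard_error


def within_equating_error(judged_cut_new, base_cut, shift, standard_error,
                          multiplier=2.0):
    """Check whether a judged cut score agrees with the equated cut score.

    judged_cut_new is on the new form's scale; adding the shift expresses it
    on the base scale, where it is compared with base_cut.
    Assumption: agreement means a gap of at most multiplier * standard_error.
    """
    judged_on_base_scale = judged_cut_new + shift
    return abs(judged_on_base_scale - base_cut) <= multiplier * standard_error
```

With hypothetical common-item difficulties, `link_constant([-1.0, -0.5, 0.0, 0.5, 1.0], [-1.2, -0.6, -0.3, 0.4, 0.7])` returns a shift and its standard error, which can then be passed to `within_equating_error` together with the judged cut score of the new form and the cut score of the base form.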