Traynor Anne, Li Cheng-Hsien, Zhou Shuqi
Purdue University, West Lafayette, IN, USA.
National Sun Yat-sen University, Kaohsiung, Taiwan.
Educ Psychol Meas. 2025 Jan 30:00131644241313212. doi: 10.1177/00131644241313212.
Inferences about student learning from large-scale achievement test scores are fundamental in education. For achievement test scores to provide useful information about student learning progress, differences in the content of instruction (i.e., the implemented curriculum) should affect test-takers' item responses. Existing research has begun to identify patterns in the content of instructionally sensitive multiple-choice achievement test items. To inform future test design decisions, this study identified instructionally (in)sensitive constructed-response achievement items, then characterized features of those items and their corresponding scoring rubrics. First, we used simulation to evaluate an item step difficulty difference index for constructed-response test items, derived from the generalized partial credit model. The statistical performance of the index was adequate, so we then applied it to data from 32 constructed-response eighth-grade science test items. We found that the instructional sensitivity (IS) index values varied appreciably across the category boundaries within an item as well as across items. Content analysis by master science teachers allowed us to identify general features of item score categories that show high, or negligible, IS.
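The abstract's index compares item step difficulties, estimated under the generalized partial credit model (GPCM), between groups of test-takers who did and did not receive the relevant instruction. The exact form of the authors' index is not given here, so the sketch below is only a minimal illustration under assumed conventions: `gpcm_category_probs` implements the standard GPCM category response function, and `step_difficulty_difference` is a hypothetical per-step contrast of group-specific step difficulty estimates, where a large positive value at a step would suggest that step is easier to clear for instructed students.

```python
import numpy as np

def gpcm_category_probs(theta, a, b_steps):
    """Category response probabilities under the generalized partial
    credit model, for ability theta, discrimination a, and a list of
    step difficulties b_steps (one per category boundary)."""
    # z_k = sum_{j<=k} a * (theta - b_j); category 0 contributes z_0 = 0
    z = np.concatenate([[0.0], np.cumsum(a * (theta - np.asarray(b_steps)))])
    ez = np.exp(z - z.max())          # stabilize before normalizing
    return ez / ez.sum()

def step_difficulty_difference(b_instructed, b_uninstructed):
    """Hypothetical per-step index: difference in estimated GPCM step
    difficulties between uninstructed and instructed groups. Positive
    values indicate a step that is harder without instruction."""
    return np.asarray(b_uninstructed) - np.asarray(b_instructed)

# Example with assumed parameter values: one item, two score steps
probs = gpcm_category_probs(theta=0.0, a=1.0, b_steps=[-1.0, 0.5])
diff = step_difficulty_difference([-1.0, 0.0], [-0.5, 0.8])
```

Varying `diff` across steps within an item mirrors the paper's finding that sensitivity can differ across category boundaries, not just across items.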