Accuracy of progress monitoring decision rules to evaluate response to instruction with two computer adaptive tests.

Affiliations

Center for Promoting Research to Practice, Lehigh University, United States of America.

Publication information

J Sch Psychol. 2024 Aug;105:101319. doi: 10.1016/j.jsp.2024.101319. Epub 2024 May 14.

Abstract

Computer adaptive tests have become popular assessments for screening students for academic risk. Research is emerging regarding their use as progress monitoring tools to measure response to instruction. We evaluated the accuracy of the trend-line decision rule when applied to outcomes from a frequently used reading computer adaptive test (i.e., Star Reading [SR]) and a frequently used math computer adaptive test (i.e., Star Math [SM]). Analyses of extant SR and SM data were conducted to inform conditions for simulations to determine the number of assessments required to yield sufficient sensitivity (i.e., probability of recommending an instructional change when a change was warranted) and specificity (i.e., probability of recommending maintaining an intervention when a change was not warranted) when comparing performance to goal lines based upon a future target score (i.e., benchmark) as well as normative comparisons (50th and 75th percentiles). The extant dataset of SR outcomes consisted of monthly progress monitoring data from 993 Grade 3, 804 Grade 4, and 709 Grade 5 students from multiple states in the northwestern United States. Data for SM were also drawn from the northwest and contained outcomes from 518 Grade 3, 474 Grade 4, and 391 Grade 5 students. Grade-level samples were predominantly White (range = 59.89%-67.72%), followed by Latinx (range = 9.65%-15.94%). Results of simulations suggest that when data were collected once a month, seven, eight, and nine observations were required to support low-stakes decisions with SR for Grades 3, 4, and 5, respectively. For SM, nine, ten, and eight observations were required for Grades 3, 4, and 5, respectively. Given the length of time required to support reasonably accurate decisions, recommendations to consider other types of assessments and decision-making frameworks for academic progress monitoring are provided.
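
The abstract does not spell out how the trend-line rule was implemented, so the following is only a minimal sketch of one common formulation, not the authors' procedure. It assumes equally spaced assessment occasions, a goal line drawn from the first observed score to a target score at the final occasion, and a recommendation to change instruction whenever the least-squares trend is flatter than that goal line. The function names (`ols_slope`, `recommend_change`, `sensitivity_specificity`) and their parameters are hypothetical. The last function computes sensitivity and specificity as the abstract defines them.

```python
from typing import Sequence, Tuple


def ols_slope(scores: Sequence[float]) -> float:
    """Least-squares slope of scores over equally spaced occasions 0, 1, 2, ..."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend line")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def recommend_change(scores: Sequence[float], goal_score: float,
                     total_occasions: int) -> bool:
    """Assumed rule: recommend a change when the trend is flatter than the goal line.

    The goal line runs from the first observed score (occasion 0) to
    goal_score at the last planned occasion (total_occasions - 1).
    """
    goal_slope = (goal_score - scores[0]) / (total_occasions - 1)
    return ols_slope(scores) < goal_slope


def sensitivity_specificity(decisions: Sequence[bool],
                            warranted: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity: P(change recommended | change warranted).
    Specificity: P(maintain recommended | change not warranted)."""
    pairs = list(zip(decisions, warranted))
    tp = sum(1 for d, w in pairs if d and w)
    fn = sum(1 for d, w in pairs if not d and w)
    tn = sum(1 for d, w in pairs if not d and not w)
    fp = sum(1 for d, w in pairs if d and not w)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one warranted and one unwarranted case")
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative (made-up) scores: four monthly observations, nine planned occasions.
slow = [100, 102, 104, 106]   # trend slope 2.0, goal slope (130 - 100) / 8 = 3.75
fast = [100, 105, 110, 115]   # trend slope 5.0
decisions = [recommend_change(s, goal_score=130, total_occasions=9)
             for s in (slow, fast)]
```

In a simulation, decisions like these would be generated for many simulated students at each number of observations and compared with each student's known true growth, which is how the number of observations needed for adequate sensitivity and specificity could be estimated.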

