Estimation of decision consistency indices for complex assessments: model based approaches.

Author information

Stearns Matthew, Smith Richard M

Affiliation

Psychometric Services, Data Recognition Corporation, 13490 Bass Lake Road, Maple Grove, MN 55311, USA.

Publication information

J Appl Meas. 2008;9(3):305-15.

Abstract

With the implementation of the No Child Left Behind assessment program and the use of proficiency levels as a means of evaluating Adequate Yearly Progress, there is a renewed interest in the consistency of classification decisions based on scale scores from achievement tests and state-wide proficiency standards. Many of the current methods described in the literature (Huynh, 1976; Hanson and Brennan, 1990; Livingston and Lewis, 1995) are based on assumptions about the distribution of the conditional errors. Although recent methods (Brennan and Wan, 2004) make no assumptions about the distribution, these methods have one compelling disadvantage: the decision consistency they calculate is based on the entire set of data and is not conditional on the location of the cut scores, the student measure, and the conditional standard errors of measurement for the students. The decision consistency for a student scoring right at the cut score will be much lower than the decision consistency for a student with a score 5 points above or below that cut score. The standard error method described in this article is based solely on the asymptotic standard error of measurement derived from the appropriate Rasch measurement model and the location of the cut score used to make the classification decision. The method can be easily modified to accommodate multiple classification categories. It is a conditional decision consistency statistic that can be applied to each person's ability estimate (raw score) and provides information that can be used to calculate the likelihood that a person with that measure will receive the same classification if retested. The decision consistency for the entire sample can be calculated by simply summing the likelihood of the same classification over all of the examinees. The results of retest simulations using data that fit the Rasch model suggest that the standard error method provides a better estimate of the resulting classification consistency than the true score methods or the bootstrap method.
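
The abstract does not give the formula, so the following is only a minimal sketch of how a conditional decision consistency of this kind might be computed. It assumes that a retest measure is normally distributed around the person's estimated measure with standard deviation equal to the Rasch standard error, and that consistency is the probability that the retest measure falls in the same category. The function names, the normality assumption, and the choice to report the sample value as an average (the sum divided by the number of examinees) are illustrative assumptions, not the authors' published procedure.

```python
import math
from bisect import bisect_right


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_consistency(measure, sem, cuts):
    """Probability that a retest measure lands in the same category.

    Assumption (not from the source): the retest measure is
    Normal(measure, sem**2). `cuts` may hold one or more cut scores,
    on the same scale as `measure`.
    """
    cuts = sorted(cuts)
    # Category index; a measure exactly at a cut is placed in the upper category.
    k = bisect_right(cuts, measure)
    lower = cuts[k - 1] if k > 0 else -math.inf
    upper = cuts[k] if k < len(cuts) else math.inf
    # Normal probability mass between the two bounding cut scores.
    return normal_cdf((upper - measure) / sem) - normal_cdf((lower - measure) / sem)


def sample_consistency(measures, sems, cuts):
    """Average the per-person probabilities to get a sample-level proportion."""
    probs = [conditional_consistency(m, s, cuts) for m, s in zip(measures, sems)]
    return sum(probs) / len(probs)


# Hypothetical values in logits: one cut score at 0.0, standard error 0.3.
at_cut = conditional_consistency(0.0, 0.3, [0.0])
far_from_cut = conditional_consistency(1.5, 0.3, [0.0])
overall = sample_consistency([0.0, 1.5], [0.3, 0.3], [0.0])
```

Under these assumptions, a person located exactly at the cut score has a consistency of 0.5, while a person five standard errors away has a consistency close to 1, which matches the contrast drawn in the abstract. Passing several cut scores in `cuts` handles the multiple-category case.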
