Estimation of decision consistency indices for complex assessments: model based approaches.

Authors

Stearns Matthew, Smith Richard M

Affiliation

Psychometric Services, Data Recognition Corporation, 13490 Bass Lake Road, Maple Grove, MN 55311, USA.

Publication

J Appl Meas. 2008;9(3):305-15.

PMID: 18753697

Abstract

With the implementation of the No Child Left Behind assessment program and the use of proficiency levels as a means of evaluating Adequate Yearly Progress, there is renewed interest in the consistency of classification decisions based on scale scores from achievement tests and state-wide proficiency standards. Many of the current methods described in the literature (Huynh, 1976; Hanson and Brennan, 1990; Livingston and Lewis, 1995) are based on assumptions about the distribution of the conditional errors. Although recent methods (Brennan and Wan, 2004) make no assumptions about the distribution, they have one compelling disadvantage: the decision consistency they calculate is based on the entire set of data and is not conditional on the location of the cut scores, the student measures, or the conditional standard errors of measurement for the students. The decision consistency for a student scoring right at the cut score will be much lower than the decision consistency for a student with a score 5 points above or below that cut score. The standard error method described in this article is based solely on the asymptotic standard error of measurement derived from the appropriate Rasch measurement model and the location of the cut score used to make the classification decision. This classification can be easily modified to accommodate multiple classification categories. The result is a conditional decision consistency statistic that can be applied to each person's ability estimate (raw score) and provides information that can be used to calculate the likelihood that a person with that measure will receive the same classification if retested. The decision consistency for the entire sample can be calculated by simply summing the likelihood of the same classification over all of the examinees.
The results of retest simulations using data that fit the Rasch model suggest that the standard error method provides a better estimate of the resulting classification consistency than the true score methods or the bootstrap method.
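
The abstract does not give the formula, so the following is only a minimal sketch of one plausible reading of a conditional, standard-error-based consistency index. It assumes (these assumptions are not from the article) that a retest measure is normally distributed around the person's estimate with standard deviation equal to the standard error of measurement, that two administrations are independent, and that consistency is the probability both land in the same category. The function names are hypothetical, and the sample-level value divides the summed likelihoods by the number of examinees to express it as a proportion.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probabilities(measure, sem, cuts):
    """Probability of landing in each category on one administration.

    Assumption (not from the article): the observed measure is
    Normal(measure, sem**2). `cuts` holds one or more cut scores on
    the same scale as `measure`.
    """
    bounds = [normal_cdf((c - measure) / sem) for c in sorted(cuts)]
    edges = [0.0] + bounds + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


def conditional_consistency(measure, sem, cuts):
    """Probability that two independent administrations agree."""
    return sum(p * p for p in category_probabilities(measure, sem, cuts))


def sample_consistency(measures, sems, cuts):
    """Summed conditional consistencies, divided by N to give a proportion."""
    values = [conditional_consistency(m, s, cuts)
              for m, s in zip(measures, sems)]
    return sum(values) / len(values)
```

Under these assumptions a person measured exactly at a single cut score has a conditional consistency of 0.5, while a person several standard errors away from it approaches 1, which matches the qualitative behaviour the abstract describes. Passing more than one cut score handles multiple classification categories.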

Similar articles

1. Estimation of decision consistency indices for complex assessments: model based approaches. J Appl Meas. 2008;9(3):305-15.
2. Rasch fit statistics as a test of the invariance of item parameter estimates. J Appl Meas. 2003;4(2):153-63.
3. Who will pass the dental OSCE? Comparison of the Angoff and the borderline regression standard setting methods. Eur J Dent Educ. 2009 Aug;13(3):162-71. doi: 10.1111/j.1600-0579.2008.00568.x.
4. Comparing holistic and analytic scoring for performance assessment with many-facet Rasch model. J Appl Meas. 2001;2(4):379-88.
5. An introduction to multidimensional measurement using Rasch models. J Appl Meas. 2003;4(1):87-100.
6. Reliability: on the reproducibility of assessment data. Med Educ. 2004 Sep;38(9):1006-12. doi: 10.1111/j.1365-2929.2004.01932.x.
7. Standard setting with dichotomous and constructed response items: some Rasch model approaches. J Appl Meas. 2009;10(4):438-54.
8. A comparative analysis of the ratings in performance assessment using generalizability theory and the many-facet Rasch model. J Appl Meas. 2009;10(4):408-23.
9. Setting performance standards for mannequin-based acute-care scenarios: an examinee-centered approach. Simul Healthc. 2008 Summer;3(2):72-81. doi: 10.1097/SIH.0b013e31816e39e2.
10. Developing examinations that use equal raw scores for cut scores. J Appl Meas. 2010;11(4):432-42.