Using Rasch measurement to score, evaluate, and improve examinations in an anatomy course.

Affiliations

Office of Medical Education, University of North Carolina, Chapel Hill School of Medicine, Chapel Hill, North Carolina; Department of Family Medicine, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, North Carolina.

Publication information

Anat Sci Educ. 2014 Nov-Dec;7(6):450-60. doi: 10.1002/ase.1436. Epub 2014 Jan 15.

DOI:10.1002/ase.1436

PMID:24431324

Abstract

Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory (CTT) methods. However, advances in item response theory software have made the application of these techniques much more accessible to classroom instructors. The purpose of this research is to analyze a common medical school anatomy examination using both the traditional CTT scoring method and a Rasch measurement scoring method to determine which technique provides more robust findings, and which set of psychometric indicators will be more meaningful and useful for anatomists looking to improve the psychometric quality and functioning of their examinations. Results produced by the more robust and meaningful methodology will undergo a rigorous psychometric validation process to evaluate construct validity. Implications of these techniques and additional possibilities for advanced applications are also discussed.
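To make the contrast between the two scoring families concrete, here is a minimal sketch on an invented 4-examinee by 4-item response matrix. It is not the authors' analysis or data: the function names, the toy responses, and the use of a log-odds transform as a crude starting value for item difficulty are all assumptions made for illustration. A real Rasch calibration would estimate person and item parameters jointly with dedicated software.

```python
import math

def ctt_item_stats(responses):
    """Classical test theory summary (illustrative helper, not from the paper).

    responses: list of rows, one per examinee, each a list of 0/1 item scores.
    Returns (raw total score per examinee, proportion correct per item).
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    p_values = [sum(row[j] for row in responses) / n_persons
                for j in range(n_items)]
    return totals, p_values

def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(correct) for ability theta and difficulty b,
    both expressed in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def logit_difficulty(p):
    """Crude starting value for item difficulty in logits, ln((1 - p) / p).
    This is only a first approximation, not a Rasch estimate.
    Requires 0 < p < 1."""
    return math.log((1.0 - p) / p)

# Hypothetical toy data: rows are examinees, columns are items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 1],
]

totals, p_values = ctt_item_stats(responses)
difficulties = [logit_difficulty(p) for p in p_values]

print("CTT raw scores:", totals)
print("CTT proportion correct per item:", p_values)
print("Approximate logit difficulties:", [round(d, 3) for d in difficulties])
print("P(correct) for theta=0 on each item:",
      [round(rasch_probability(0.0, b), 3) for b in difficulties])
```

The CTT quantities depend directly on the particular sample and test form, whereas the Rasch model places persons and items on a shared logit scale, which is the property the abstract's comparison turns on.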
