Exploring the use of Rasch modelling in "common content" items for multi-site and multi-year assessment.

Author information

Hope David, Kluth David, Homer Matthew, Dewar Avril, Goddard-Fuller Rikki, Jaap Alan, Cameron Helen

Affiliations

Medical Education Unit, The Chancellor's Building, College of Medicine and Veterinary Medicine, The University of Edinburgh, 49 Little France Crescent, Edinburgh, Scotland, EH16 4SB, UK.

Leeds Institute of Medical Education, Leeds School of Medicine, Worsley Building, University of Leeds, Woodhouse, Leeds, LS2 9JT, UK.

Publication information

Adv Health Sci Educ Theory Pract. 2025 Apr;30(2):427-438. doi: 10.1007/s10459-024-10354-y. Epub 2024 Jul 8.

DOI:10.1007/s10459-024-10354-y

PMID:38977526

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11965148/

Abstract

Rasch modelling is a powerful tool for evaluating item performance, measuring drift in difficulty over time, and comparing students who sat assessments at different times or at different sites. Here, we use data from thirty UK medical schools to describe the benefits of Rasch modelling in quality assurance and the barriers to using it. Sixty "common content" multiple choice items were offered to all UK medical schools in 2016-17, and a further sixty in 2017-18, with five available in both years. Thirty medical schools participated, for sixty total datasets across two sessions, and 14,342 individual sittings. Schools selected items to embed in written assessment near the end of their programmes. We applied Rasch modelling to evaluate unidimensionality, model fit statistics and item quality, horizontal equating to compare performance across schools, and vertical equating to compare item performance across time. Of the sixty sittings, three provided non-unidimensional data, and eight violated goodness of fit measures. Item-level statistics identified potential improvements in item construction and provided quality assurance. Horizontal equating demonstrated large differences in scores across schools, while vertical equating showed item characteristics were stable across sessions. Rasch modelling provides significant advantages in model- and item-level reporting compared to classical approaches. However, the complexity of the analysis and the smaller number of educators familiar with Rasch must be addressed locally for a programme to benefit. Furthermore, due to the comparative novelty of Rasch modelling, there is greater ambiguity on how to proceed when a Rasch model identifies misfitting or problematic data.
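For readers unfamiliar with the approach, the following is a minimal illustrative sketch, not the authors' analysis pipeline. It assumes the dichotomous Rasch model, in which the probability of a correct response is a logistic function of the difference between person ability and item difficulty (both in logits). It also assumes simple mean-mean linking through common items as one possible way to place two sessions on a shared scale; the abstract does not state which linking method was used. All function names, item names and difficulty values are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def mean_mean_shift(difficulties_a, difficulties_b, common_items):
    """Constant to add to session B difficulties to express them on
    session A's scale, estimated from the common (anchor) items only."""
    diffs = [difficulties_a[i] - difficulties_b[i] for i in common_items]
    return sum(diffs) / len(diffs)


# Hypothetical item difficulties (logits) from two separately calibrated sessions.
year1 = {"item1": -0.50, "item2": 0.20, "item3": 1.10}
year2 = {"item1": -0.80, "item2": -0.10, "item3": 0.80, "item4": 0.40}
anchors = ["item1", "item2", "item3"]  # items offered in both sessions

# Link session 2 onto the session 1 scale.
shift = mean_mean_shift(year1, year2, anchors)
year2_on_year1_scale = {item: b + shift for item, b in year2.items()}

# Residual difference per anchor item after linking: values near zero
# suggest stable item difficulty, large values suggest drift.
drift = {item: year2_on_year1_scale[item] - year1[item] for item in anchors}

if __name__ == "__main__":
    print("shift:", round(shift, 3))
    print("drift:", {k: round(v, 3) for k, v in drift.items()})
    print("P(correct), ability 0 vs item4:",
          round(rasch_probability(0.0, year2_on_year1_scale["item4"]), 3))
```

In this toy data every anchor item differs by the same amount between sessions, so the residual drift after linking is zero. In real data, anchors with large residuals would be candidates for review before being used for equating.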

