Similarity of the cut score in test sets with different item amounts using the modified Angoff, modified Ebel, and Hofstee standard-setting methods for the Korean Medical Licensing Examination.

Affiliations

Department of Medical Education, Soonchunhyang University College of Medicine, Asan, Korea.

Korea Health Personnel Licensing Examination Institute, Seoul, Korea.

Publication information

J Educ Eval Health Prof. 2020;17:28. doi: 10.3352/jeehp.2020.17.28. Epub 2020 Oct 5.

Abstract

PURPOSE

The Korean Medical Licensing Examination (KMLE) typically contains a large number of items. This study investigated whether the cut score obtained in standard-setting differs when panelists evaluate all items of the exam versus only a portion of them.

METHODS

We divided the item sets from the KMLEs of the 3 most recent years into 4 subsets per year, each containing 25% of the items, based on item content category, discrimination index, and difficulty index. The entire 15-member panel assessed all 360 items (100%) from 2017. Using the same method, each item set in split-half set 1 contained 184 items (51%) from 2018, and each item set in split-half set 2 contained 182 items (51%) from 2019. We used the modified Angoff, modified Ebel, and Hofstee methods in the standard-setting process.
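The subset construction described above can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the abstract does not say how the stratification was carried out, so the sort-then-deal approach, the function name `split_into_equivalent_subsets`, the item fields, and the randomly generated item bank are all assumptions.

```python
import random
from collections import Counter


def split_into_equivalent_subsets(items, n_subsets=4):
    """Deal items into n_subsets so that each subset has a similar mix of
    content category, difficulty, and discrimination (assumed approach)."""
    # Sort so that similar items sit next to each other ...
    ordered = sorted(
        items,
        key=lambda it: (it["category"], it["difficulty"], it["discrimination"]),
    )
    # ... then deal them out round-robin, like cards, across the subsets.
    subsets = [[] for _ in range(n_subsets)]
    for position, item in enumerate(ordered):
        subsets[position % n_subsets].append(item)
    return subsets


# Synthetic item bank (hypothetical values, not KMLE data).
rng = random.Random(0)
items = [
    {
        "id": i,
        "category": rng.choice(["A", "B", "C", "D", "E"]),
        "difficulty": rng.random(),       # proportion answering correctly
        "discrimination": rng.random(),   # discrimination index
    }
    for i in range(360)
]

subsets = split_into_equivalent_subsets(items)
for number, subset in enumerate(subsets, start=1):
    category_counts = Counter(it["category"] for it in subset)
    print(number, len(subset), dict(sorted(category_counts.items())))
```

Because items are dealt from a sorted list, each content category is spread across the 4 subsets with counts that differ by at most one, and difficulty is interleaved within each category.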

RESULTS

When the same method was used to stratify the item subsets, cut scores differed by less than 1% across subsets containing 25%, 51%, or 100% of the entire set. Rater reliability was higher when fewer items were rated.
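To make the comparison concrete, the sketch below computes a basic Angoff-style cut score for a full item set and for a 25% subset. It rests on several assumptions: the abstract does not describe the study's specific modifications to the Angoff method, the function name `angoff_cut_score` is hypothetical, the ratings are randomly generated rather than real panel data, and taking every fourth item stands in for the stratified subsets.

```python
import random
from statistics import mean


def angoff_cut_score(ratings):
    """Basic Angoff cut score as a percentage.

    ratings[p][i] is panelist p's estimate of the probability that a
    minimally competent examinee answers item i correctly.
    """
    # Average each panelist's item ratings, then average across panelists.
    panelist_means = [mean(panelist_ratings) for panelist_ratings in ratings]
    return 100 * mean(panelist_means)


# Synthetic ratings: 15 panelists x 360 items (hypothetical values).
rng = random.Random(1)
full_ratings = [[rng.uniform(0.4, 0.9) for _ in range(360)] for _ in range(15)]

# Keep every fourth item as a stand-in for one 25% subset (90 items).
subset_ratings = [panelist_ratings[::4] for panelist_ratings in full_ratings]

full_cut = angoff_cut_score(full_ratings)
subset_cut = angoff_cut_score(subset_ratings)
print(f"full set: {full_cut:.2f}%  subset: {subset_cut:.2f}%")
print(f"absolute difference: {abs(full_cut - subset_cut):.2f} percentage points")
```

With real panel ratings, the printed difference is the quantity the study reports as being under 1%.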

CONCLUSION

When the entire item set was divided into equivalent subsets, assessing the exam using a portion of the item set (90 out of 360 items) yielded similar cut scores to those derived using the entire item set. There was a higher correlation between panelists' individual assessments and the overall assessments.
