Faculty of Health, Institute for Teaching and Educational Research in Health Sciences, Witten/Herdecke University, Germany.
Adv Health Sci Educ Theory Pract. 2011 May;16(2):211-21. doi: 10.1007/s10459-010-9256-1. Epub 2010 Oct 31.
To compare different scoring algorithms for Pick-N multiple-correct-answer multiple-choice (MC) exams with regard to test reliability, student performance, total item discrimination and item difficulty. Data from six end-of-term exams in internal medicine, taken by 3rd-year medical students at Munich University from 2005 to 2008, were analysed (1,255 students, 180 Pick-N items in total). Scoring algorithms: Each question scored a maximum of one point. We compared: (a) Dichotomous scoring (DS): One point if all true answers and no wrong answers were chosen. (b) Partial credit algorithm 1 (PS(50)): One point if 100% of the true answers were chosen; 0.5 points if 50% or more of the true answers were chosen; zero points if less than 50% of the true answers were chosen. No points were deducted for wrong choices. (c) Partial credit algorithm 2 (PS(1/m)): For each true answer identified, a fraction of one point was given, with the fraction depending on the total number of true answers. No points were deducted for wrong choices. Partial crediting produced psychometric results superior to those of dichotomous scoring (DS). The two partial credit algorithms produced similar psychometric data, with PS(50) yielding only slightly higher reliability coefficients than PS(1/m). The Pick-N MC format and its scoring with the PS(50) and PS(1/m) algorithms are suited to undergraduate medical examinations. Partial knowledge should be rewarded in Pick-N MC exams.
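The three scoring rules described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; it is one reading of the abstract under stated assumptions: m is taken to be the number of true answers in the item's key, "50% or more true answers" is read as the share of the key's true answers that the student selected, and the function names (`score_ds`, `score_ps50`, `score_ps1m`) are hypothetical.

```python
from fractions import Fraction


def _hits(selected, key):
    """Number of true answers (from the key) that the student selected."""
    return len(set(selected) & set(key))


def score_ds(selected, key):
    """Dichotomous scoring: 1 point only if the selection equals the key exactly."""
    return Fraction(1) if set(selected) == set(key) else Fraction(0)


def score_ps50(selected, key):
    """PS(50): 1 point for all true answers, 0.5 for at least half, else 0.

    Assumption: wrong choices are ignored (no deduction), as the abstract states.
    """
    m = len(set(key))          # assumed meaning of m: number of true answers
    hits = _hits(selected, key)
    if hits == m:
        return Fraction(1)
    if 2 * hits >= m:          # at least 50% of the true answers identified
        return Fraction(1, 2)
    return Fraction(0)


def score_ps1m(selected, key):
    """PS(1/m): 1/m of a point per true answer identified, no deduction."""
    m = len(set(key))
    return Fraction(_hits(selected, key), m)


# Hypothetical Pick-3 item: options A-F, true answers A, C and E.
key = {"A", "C", "E"}
two_of_three = {"A", "C", "D"}   # two true answers, one wrong
one_of_three = {"A", "B", "D"}   # one true answer, two wrong
```

For the hypothetical item above, a student who identifies two of the three true answers receives 0 points under DS, 1/2 under PS(50) and 2/3 under PS(1/m); a student who identifies only one receives 0, 0 and 1/3 respectively. Exact fractions are used here only to avoid floating-point rounding in the comparison.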