Kanzow P, Schuelper N, Witt D, Wassmann T, Sennhenn-Kirchner S, Wiegand A, Raupach T
Department of Preventive Dentistry, Periodontology and Cariology, University Medical Center Goettingen, Goettingen, Germany.
Department of Haematology and Medical Oncology, University Medical Center Goettingen, Goettingen, Germany.
Eur J Dent Educ. 2018 Nov;22(4):e669-e678. doi: 10.1111/eje.12372. Epub 2018 Jun 22.
Various scoring approaches for Multiple True-False (MTF) items exist. This study aimed to compare the scoring results obtained with different scoring approaches and to assess the effect of item cues on each scoring approach's results.
Different scoring approaches (MTF, Count-2, Count-3, "Vorkauf-Method," PS, Dichotomized MTF, "Blasberg-Method," Multiple response (MR), Correction for Guessing, "Ripkey-Method," Morgan-Method, Balanced Scoring Method) were retrospectively applied to all MTF items used within electronic examinations of undergraduate dental students at the University Medical Center Göttingen in the winter term 2016/2017 (1297 marking events). Item quality was evaluated regarding formal parameters such as presence of cues and correctness of content. Differences between the scoring results of all scoring approaches, and differences between each method's scoring results for items with and without cues, were assessed with Wilcoxon rank sum tests (P < .05).
Average scoring results per item differed markedly between the scoring approaches and ranged from 0.46 (MR) to 0.92 (Dichotomized MTF). The presence of cues led to significantly higher scores for all scoring approaches (P < .001; +0.14 on average). However, the effect of cues differed amongst scoring approaches and ranged from +0.04 (Dichotomized MTF) to +0.20 (MR).
Scoring of MTF items is complex. The data presented in this manuscript may help educators make informed choices about scoring algorithms.
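To illustrate why the choice of scoring rule can shift results so strongly, the following minimal sketch scores the same response under two generic rules: per-statement partial credit and all-or-nothing. The function names, the answer key, and the example response are hypothetical, and these two rules are not claimed to match the exact definitions of any method named in the study.

```python
from typing import Sequence


def _check(marks: Sequence[bool], key: Sequence[bool]) -> None:
    # Both sequences must describe the same, non-empty set of statements.
    if len(marks) != len(key) or len(key) == 0:
        raise ValueError("marks and key must be non-empty and of equal length")


def partial_credit(marks: Sequence[bool], key: Sequence[bool]) -> float:
    """Hypothetical rule: fraction of statements marked in agreement with the key."""
    _check(marks, key)
    correct = sum(m == k for m, k in zip(marks, key))
    return correct / len(key)


def all_or_nothing(marks: Sequence[bool], key: Sequence[bool]) -> float:
    """Hypothetical rule: full credit only if every statement is marked correctly."""
    _check(marks, key)
    return 1.0 if all(m == k for m, k in zip(marks, key)) else 0.0


# Illustrative data only: one item with four statements, one statement marked wrongly.
key = [True, False, True, True]
marks = [True, False, False, True]

print(partial_credit(marks, key))   # 0.75
print(all_or_nothing(marks, key))   # 0.0
```

A single wrong mark costs a quarter of the credit under the first rule and all of it under the second, which is the kind of divergence the reported range of average scores reflects.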