

Dissecting knowledge, guessing, and blunder in multiple choice assessments.

Author information

Abu-Ghazalah Rashid M, Dubins David N, Poon Gregory M K

Affiliations

W. Booth School of Engineering Practice and Technology, Faculty of Engineering, McMaster University, Hamilton, Ontario, Canada.

Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, Ontario, Canada.

Publication information

Appl Meas Educ. 2023;36(1):80-98. doi: 10.1080/08957347.2023.2172017. Epub 2023 Feb 21.

Abstract

Multiple choice (MC) results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly account for guessing, knowledge and blunder using eight assessments (>9,000 responses) from an undergraduate biotechnology curriculum. A Bayesian implementation of the models, aimed at assessing their robustness to prior beliefs in examinee knowledge, showed that explicit estimators of knowledge are markedly sensitive to prior beliefs with scores as sole input. To overcome this limitation, we examined self-ranked confidence as a proxy knowledge indicator. For our test set, three levels of confidence resolved test performance. Responses rated as least confident were correct more frequently than expected from random selection, reflecting partial knowledge, but were balanced by blunder among the most confident responses. By translating evidence-based guessing and blunder rates to pass marks that statistically qualify a desired level of examinee knowledge, our approach finds practical utility in test analysis and design.
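The abstract does not give the models' equations, so the following is only an illustrative sketch of how guessing and blunder rates could be turned into a pass mark, not the authors' implementation. It assumes a simple per-item mixture: an examinee knows an item with probability `knowledge`, a known item is answered wrongly with probability `blunder_rate`, and an unknown item is answered correctly with probability `guess_rate`. Items are assumed independent, so the total score is binomial. All function and parameter names are hypothetical.

```python
from math import comb


def p_correct(knowledge, guess_rate, blunder_rate):
    """Per-item probability of a correct response under the assumed mixture."""
    return knowledge * (1.0 - blunder_rate) + (1.0 - knowledge) * guess_rate


def binomial_tail(n_items, min_score, p):
    """P(score >= min_score) for n_items independent items, each correct with probability p."""
    return sum(
        comb(n_items, s) * p**s * (1.0 - p) ** (n_items - s)
        for s in range(min_score, n_items + 1)
    )


def pass_mark(n_items, knowledge, guess_rate, blunder_rate, alpha=0.05):
    """Smallest score that an examinee at the given knowledge level
    reaches with probability at most alpha (assumed decision rule)."""
    p = p_correct(knowledge, guess_rate, blunder_rate)
    for mark in range(n_items + 1):
        if binomial_tail(n_items, mark, p) <= alpha:
            return mark
    return None  # no score is that unlikely, even a perfect one


if __name__ == "__main__":
    # Hypothetical 40-item test with four options per item and a 5% blunder rate.
    print(pass_mark(n_items=40, knowledge=0.5, guess_rate=0.25, blunder_rate=0.05))
```

Under these assumptions, a score at or above the returned mark is unlikely for an examinee at the stated knowledge level, so passing can be read as statistical evidence of knowledge above that level. Evidence-based guessing and blunder rates would replace the placeholder values used here.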



