The effects of violating standard item writing principles on tests and students: the consequences of using flawed test items on achievement examinations in medical education.

Author information

Downing Steven M

Affiliation

Department of Medical Education (MC 591), College of Medicine, University of Illinois at Chicago, 60612-7309, USA.

Publication information

Adv Health Sci Educ Theory Pract. 2005;10(2):133-43. doi: 10.1007/s10459-004-4019-5.

DOI: 10.1007/s10459-004-4019-5

PMID: 16078098

Abstract

The purpose of this research was to study the effects of violations of standard multiple-choice item writing principles on test characteristics, student scores, and pass-fail outcomes. Four basic science examinations, administered to year-one and year-two medical students, were randomly selected for study. Test items were classified as either standard or flawed by three independent raters, blinded to all item performance data. Flawed test questions violated one or more standard principles of effective item writing. Thirty-six to sixty-five percent of the items on the four tests were flawed. Flawed items were 0-15 percentage points more difficult than standard items measuring the same construct. Over all four examinations, 646 (53%) students passed the standard items while 575 (47%) passed the flawed items. The median passing rate difference between flawed and standard items was 3.5 percentage points, but ranged from -1 to 35 percentage points. Item flaws had little effect on test score reliability or other psychometric quality indices. Results showed that flawed multiple-choice test items, which violate well established and evidence-based principles of effective item writing, disadvantage some medical students. Item flaws introduce the systematic error of construct-irrelevant variance to assessments, thereby reducing the validity evidence for examinations and penalizing some examinees.

Similar articles

The effects of violating standard item writing principles on tests and students: the consequences of using flawed test items on achievement examinations in medical education.

Adv Health Sci Educ Theory Pract. 2005;10(2):133-43. doi: 10.1007/s10459-004-4019-5.

Impact of item-writing flaws in multiple-choice questions on student achievement in high-stakes nursing assessments.

Med Educ. 2008 Feb;42(2):198-206. doi: 10.1111/j.1365-2923.2007.02957.x.

The frequency of item writing flaws in multiple-choice questions used in high stakes nursing assessments.

Nurse Educ Today. 2006 Dec;26(8):662-71. doi: 10.1016/j.nedt.2006.07.006. Epub 2006 Oct 2.

Quality assurance of item writing: during the introduction of multiple choice questions in medicine for high stakes examinations.

Med Teach. 2009 Mar;31(3):238-43. doi: 10.1080/01421590802155597.

Do item-writing flaws reduce examinations psychometric quality?

BMC Res Notes. 2016 Aug 11;9(1):399. doi: 10.1186/s13104-016-2202-4.

A comparison of the psychometric properties of three- and four-option multiple-choice questions in nursing assessments.

Nurse Educ Today. 2010 Aug;30(6):539-43. doi: 10.1016/j.nedt.2009.11.002. Epub 2010 Jan 6.

Use of flawed multiple-choice items by the New England Journal of Medicine for continuing medical education.

Med Teach. 2006 Sep;28(6):566-8. doi: 10.1080/01421590600711153.

Use of a committee review process to improve the quality of course examinations.

Adv Health Sci Educ Theory Pract. 2006 Feb;11(1):61-8. doi: 10.1007/s10459-004-7515-8.

Education techniques for lifelong learning: writing multiple-choice questions for continuing medical education activities and self-assessment modules.

Radiographics. 2006 Mar-Apr;26(2):543-51. doi: 10.1148/rg.262055145.

It takes only 100 true-false items to test medical students: true or false?

Med Teach. 2005 Aug;27(5):468-72. doi: 10.1080/01421590500097018.

Cited by

Assessing LLM-generated vs. expert-created clinical anatomy MCQs: a student perception-based comparative study in medical education.

Med Educ Online. 2025 Dec;30(1):2554678. doi: 10.1080/10872981.2025.2554678. Epub 2025 Aug 30.

Postgraduate Clinical Residency: The Impact of Multiple-Choice Question Quality on Exam Success Rates.

Adv Med Educ Pract. 2025 Aug 6;16:1381-1397. doi: 10.2147/AMEP.S525828. eCollection 2025.

Evaluating the quality of multiple-choice question pilot database: A global educator-created tool for concept-based pharmacology learning.

Pharmacol Res Perspect. 2024 Oct;12(5):e70004. doi: 10.1002/prp2.70004.

A Meta-Analysis of the Reliability of Second Language Listening Tests (1991-2022).

Brain Sci. 2024 Jul 25;14(8):746. doi: 10.3390/brainsci14080746.

Comparison of Multiple-Choice Question Formats in a First Year Medical Physiology Course.

J CME. 2024 Aug 12;13(1):2390264. doi: 10.1080/28338073.2024.2390264. eCollection 2024.

Implementation of the São Paulo Nursing Courses Consortium for the Progress Test: experience report.

Rev Esc Enferm USP. 2024 Jun 28;58:e20230347. doi: 10.1590/1980-220X-REEUSP-2023-0347en. eCollection 2024.

Nurturing Untapped Integration Expertise of MS4 Assessment Writers.

Med Sci Educ. 2024 Jan 13;34(2):315-318. doi: 10.1007/s40670-024-01974-5. eCollection 2024 Apr.

Questioning the questions: Methods used by medical schools to review internal assessment items.

MedEdPublish (2016). 2021 Feb 5;10:37. doi: 10.15694/mep.2021.000037.1. eCollection 2021.

Postexamination item analysis of undergraduate pediatric multiple-choice questions exam: implications for developing a validated question Bank.

BMC Med Educ. 2024 Feb 21;24(1):168. doi: 10.1186/s12909-024-05153-3.

Examining the impact of specific types of item-writing flaws on student performance and psychometric properties of the multiple choice question.

MedEdPublish (2016). 2018 Oct 2;7:225. doi: 10.15694/mep.2018.0000225.1. eCollection 2018.
