Validity threats: overcoming interference with proposed interpretations of assessment data.

Authors

Downing Steven M, Haladyna Thomas M

Affiliation

University of Illinois at Chicago, College of Medicine, Department of Medical Education, Chicago, Illinois 60612-7309, USA.

Publication information

Med Educ. 2004 Mar;38(3):327-33. doi: 10.1046/j.1365-2923.2004.01777.x.

PMID: 14996342

Abstract

CONTEXT

Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity.

PURPOSE

The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided.

DISCUSSION

The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged.

CONCLUSIONS

There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.

Similar articles

1
Validity threats: overcoming interference with proposed interpretations of assessment data.
Med Educ. 2004 Mar;38(3):327-33. doi: 10.1046/j.1365-2923.2004.01777.x.
2
Reliability: on the reproducibility of assessment data.
Med Educ. 2004 Sep;38(9):1006-12. doi: 10.1111/j.1365-2929.2004.01932.x.
3
Threats to the validity of locally developed multiple-choice tests in medical education: construct-irrelevant variance and construct underrepresentation.
Adv Health Sci Educ Theory Pract. 2002;7(3):235-41. doi: 10.1023/a:1021112514626.
4
Validity: on meaningful interpretation of assessment data.
Med Educ. 2003 Sep;37(9):830-7. doi: 10.1046/j.1365-2923.2003.01594.x.
5
The effects of violating standard item writing principles on tests and students: the consequences of using flawed test items on achievement examinations in medical education.
Adv Health Sci Educ Theory Pract. 2005;10(2):133-43. doi: 10.1007/s10459-004-4019-5.
6
Comprehensive undergraduate medical assessments improve prediction of clinical performance.
Med Educ. 2004 Oct;38(10):1111-6. doi: 10.1111/j.1365-2929.2004.01962.x.
7
In Brief: Validity of Case Summaries in Written Examinations of Clinical Reasoning.
Teach Learn Med. 2016 Oct-Dec;28(4):375-384. doi: 10.1080/10401334.2016.1190730. Epub 2016 Jun 13.
8
Should essays and other "open-ended"-type questions retain a place in written summative assessment in clinical medicine?
BMC Med Educ. 2014 Nov 28;14:249. doi: 10.1186/s12909-014-0249-2.
9
An empirical study of the predictive validity of number grades in medical school using 3 decades of longitudinal data: implications for a grading system.
Med Educ. 2004 Apr;38(4):425-34. doi: 10.1111/j.1365-2923.2004.01774.x.
10
Construct-irrelevant variance and flawed test questions: Do multiple-choice item-writing principles make any difference?
Acad Med. 2002 Oct;77(10 Suppl):S103-4. doi: 10.1097/00001888-200210001-00032.

Cited by

1
Evaluating construct validity of virtual OSCEs in exceptional conditions.
BMC Med Educ. 2025 Jun 5;25(1):841. doi: 10.1186/s12909-025-07383-5.
2
The pattern of reporting and presenting validity evidence of extended matching questions (EMQs) in health professions education: a systematic review.
Med Educ Online. 2024 Dec 31;29(1):2412392. doi: 10.1080/10872981.2024.2412392. Epub 2024 Oct 24.
3
Legitimation Without Argumentation: An Empirical Discourse Analysis of 'Validity as an Argument' in Assessment.
Perspect Med Educ. 2024 Oct 3;13(1):469-480. doi: 10.5334/pme.1404. eCollection 2024.
4
Twelve tips for introducing the concept of validity argument in assessment to novice medical teachers in a workshop.
MedEdPublish (2016). 2021 Sep 21;10:74. doi: 10.15694/mep.2021.000074.2. eCollection 2021.
5
Trainee-supervisor collaboration, progress-visualisation, and coaching: a survey on challenges in assessment of ICU trainees.
BMC Med Educ. 2024 Feb 6;24(1):120. doi: 10.1186/s12909-023-04980-0.
6
Validity of a virtual reality endoscopic retrograde cholangiopancreatography simulator: can it distinguish experts from novices?
Front Surg. 2023 Dec 6;10:1289197. doi: 10.3389/fsurg.2023.1289197. eCollection 2023.
7
Modern Assessments of Intelligence Must Be Fair and Equitable.
J Intell. 2023 Jun 20;11(6):126. doi: 10.3390/jintelligence11060126.
8
Formative Objective Structured Clinical Examinations (OSCEs) as an Assessment Tool in UK Undergraduate Medical Education: A Review of Its Utility.
Cureus. 2023 May 4;15(5):e38519. doi: 10.7759/cureus.38519. eCollection 2023 May.
9
TRainee Attributable & Automatable Care Evaluations in Real-time (TRACERs): A Scalable Approach for Linking Education to Patient Care.
Perspect Med Educ. 2023 May 17;12(1):149-159. doi: 10.5334/pme.1013. eCollection 2023.
10
The Pain Medicine Curriculum Framework-structured integration of pain medicine education into the medical curriculum.
Front Pain Res (Lausanne). 2023 Jan 9;3:1057114. doi: 10.3389/fpain.2022.1057114. eCollection 2022.