
Are medical school preclinical tests biased for sex and race? A differential item functioning analysis.

Authors

Dale Esther Dasari, Abulela Mohammed A A, Jia Hao, Violato Claudio

Affiliations

University of Minnesota Medical School, 420 Delaware Street SE, Mayo Building, Minneapolis, MN, 55455, USA.

Department of Educational Psychology, University of Minnesota, Minneapolis, MN, 55455, USA.

Publication

BMC Med Educ. 2025 Jan 29;25(1):146. doi: 10.1186/s12909-024-06540-6.

DOI:10.1186/s12909-024-06540-6
PMID:39881271
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11780802/
Abstract

BACKGROUND

A common practice in assessment development, fundamental for fairness and consequently the validity of test score interpretations and uses, is to ascertain whether test items function equally across test-taker groups. Accordingly, we conducted differential item functioning (DIF) analysis, a psychometric procedure for detecting potential item bias, for three preclinical medical school foundational courses based on students' sex and race.

METHODS

The sample included 520, 519, and 344 medical students for anatomy, histology, and physiology, respectively, collected from 2018 to 2020. To conduct DIF analysis, we used the Wald test based on the two-parameter logistic model as utilized in the IRTPRO software.
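
As a rough illustration of the method named above, the following sketch shows the two-parameter logistic (2PL) item response function and a Wald chi-square comparing one item's parameters between a reference and a focal group. It is a minimal sketch, not IRTPRO's implementation: the function names, the example estimates, and the assumption that the two groups' estimates are independent are all hypothetical and chosen only for illustration.

```python
import math


def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer
    for ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def wald_dif(est_ref, est_focal, cov_ref, cov_focal):
    """Wald chi-square (df = 2) for equality of (a, b) across two groups.

    est_* are (a, b) tuples; cov_* are 2x2 covariance matrices of those
    estimates. Assumes (hypothetically) independent group estimates.
    """
    d0 = est_ref[0] - est_focal[0]
    d1 = est_ref[1] - est_focal[1]
    # Covariance of the difference is the sum under independence.
    s00 = cov_ref[0][0] + cov_focal[0][0]
    s01 = cov_ref[0][1] + cov_focal[0][1]
    s11 = cov_ref[1][1] + cov_focal[1][1]
    det = s00 * s11 - s01 * s01
    # Quadratic form d' S^-1 d using the closed-form 2x2 inverse.
    w = (d0 * d0 * s11 - 2.0 * d0 * d1 * s01 + d1 * d1 * s00) / det
    # For df = 2 the chi-square survival function is exactly exp(-w / 2).
    p_value = math.exp(-w / 2.0)
    return w, p_value


# Made-up estimates: same discrimination, difficulty shifted by 0.5.
cov = [[0.01, 0.0], [0.0, 0.01]]
w, p = wald_dif((1.2, 0.0), (1.2, 0.5), cov, cov)
```

A small p-value here would flag the item for review; it would not by itself establish that the item is biased, which is why the study pairs the statistical test with expert review of the flagged items.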

RESULTS

Across the three assessments, as many as one-fifth of the items showed statistically significant DIF for sex, race, or both: 10 of 49 items (20%) for Anatomy, 6 of 40 items (15%) for Histology, and 5 of 45 items (11%) for Physiology. Measurement specialists and subject matter experts independently reviewed the items to identify construct-irrelevant factors as potential sources of DIF (see Appendix A). Most of the flagged items were poorly written or had unclear images.
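
The reported percentages can be checked directly from the item counts. This short sketch uses only the counts given in the abstract; the variable and function names are illustrative.

```python
# (course, items flagged for DIF, total items) as reported in the abstract
reported = [("Anatomy", 10, 49), ("Histology", 6, 40), ("Physiology", 5, 45)]


def dif_rates(rows):
    """Return the percentage of flagged items per course, rounded to whole percent."""
    return {course: round(100 * flagged / total) for course, flagged, total in rows}


rates = dif_rates(reported)
```

The highest rate, for Anatomy, is the "one-fifth" figure cited in the text.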

CONCLUSIONS

The validity of score-based inferences, particularly for group comparisons, requires test items to function equally across test-taker groups. In the present study, we found DIF for sex and race in some items in three content areas. The present approach should be applied in other medical schools to assess the generalizability of these findings. Item-level DIF analysis should also be conducted routinely as part of psychometric analyses for basic science courses and other assessments.

CLINICAL TRIAL NUMBER

Not applicable.
