Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
Accreditation Council for Graduate Medical Education, Chicago, IL, USA.
J Gen Intern Med. 2024 Aug;39(10):1795-1802. doi: 10.1007/s11606-024-08645-6. Epub 2024 Jan 30.
While some prior studies of work-based assessment (WBA) numeric ratings have not shown gender differences, they have been unable to account for the true performance of the resident or explore narrative differences by gender.
To explore gender differences in WBA ratings as well as narrative comments (when scripted performance was known).
Secondary analysis of WBAs obtained from a randomized controlled trial of a longitudinal rater training intervention in 2018-2019. Participating faculty (n = 77) observed standardized resident-patient encounters and subsequently completed rater assessment forms (RAFs).
Participants were faculty taking part in the longitudinal rater training study.
Gender differences in mean entrustment ratings (4-point scale) were assessed with multivariable regression (adjusted for scripted performance, rater and resident demographics, and the interaction between study arm and time period [pre- versus post-intervention]). Narrative comments were coded with pre-specified natural language processing categories (masculine, feminine, agentic, and communal words), and multivariable linear regression was used to determine associations of word use with resident gender, race, and skill level; faculty demographics; and the interaction between study arm and time period (pre- versus post-intervention).
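As a rough illustration of the word-category step described above, the following minimal sketch computes the share of words in a comment that fall into each category. The lexicons, the `CATEGORIES` mapping, and the `category_rates` function are hypothetical placeholders; the abstract does not specify the actual word lists or software used, and the regression step is not shown.

```python
import re
from collections import Counter

# Hypothetical, tiny lexicons for illustration only; the study's actual
# pre-specified word lists are not given in the abstract.
CATEGORIES = {
    "feminine": {"caring", "warm", "supportive"},
    "masculine": {"confident", "decisive", "assertive"},
    "agentic": {"led", "directed", "independent"},
    "communal": {"listened", "collaborative", "helpful"},
}


def category_rates(comment: str) -> dict[str, float]:
    """Return, per category, the fraction of the comment's words in that category."""
    words = re.findall(r"[a-z']+", comment.lower())
    if not words:
        # An empty comment contributes a rate of zero to every category.
        return {name: 0.0 for name in CATEGORIES}
    counts = Counter(words)
    return {
        name: sum(counts[w] for w in lexicon) / len(words)
        for name, lexicon in CATEGORIES.items()
    }


# Example: 8 words, 2 of them in the (hypothetical) feminine lexicon.
rates = category_rates("She was warm and caring but not decisive.")
```

Per-comment rates of this kind could then serve as the outcome variable in a regression on resident and faculty characteristics.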
Across 1527 RAFs, entrustment ratings differed significantly between women and men standardized residents (2.29 versus 2.54, respectively, p < 0.001) after correction for resident skill level. In comments on what the resident did poorly, feminine terms were more common for women residents than for men (β 0.45, CI 0.12-0.78, p = 0.01). This difference persisted after adjusting for the faculty's entrustment ratings. There were no other significant linguistic differences by gender.
In contrast to prior studies, we found entrustment rating differences in a simulated WBA that persisted after adjusting for the resident's scripted performance. There were also linguistic differences by gender after adjusting for entrustment ratings, with feminine terms used more frequently about women in some, but not all, narrative comments.