University of New Mexico School of Medicine, Albuquerque, NM, USA.
Max Stern College, Emek Yezreel, Israel.
Am J Surg. 2021 Dec;222(6):1051-1059. doi: 10.1016/j.amjsurg.2021.09.034. Epub 2021 Oct 2.
Letters of recommendation (LoRs) play an important role in resident selection. The language authors use can differ implicitly depending on whether the applicant is male or female. We examined gender bias in LoRs written for surgical residency candidates across three decades at one institution.
We retrospectively analyzed LoRs written for general surgery residency candidates between 1980 and 2011, using artificial intelligence (AI) for natural language processing (NLP) and sentiment analysis, and computer-based algorithms to detect gender bias. Applicants were grouped by scaled clerkship grades and USMLE scores. Groups were compared with t-tests, ANOVA, and non-parametric tests, as appropriate.
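The abstract does not name the specific algorithm used to detect gendered wording. As a rough illustration only, one common approach is to count words that match lists of stereotypically communal and agentic stems, then normalize by letter length. The stem lists and the function names `gendered_word_counts`, `gendered_rate`, and `mean_rate` below are hypothetical and are not taken from the study.

```python
import re
from statistics import mean

# Hypothetical stems for illustration only; NOT the lexicon used in the study.
COMMUNAL_STEMS = ("compassion", "caring", "warm", "help", "kind")
AGENTIC_STEMS = ("confiden", "ambitio", "independen", "leader", "decisi")


def gendered_word_counts(letter: str) -> dict:
    """Count tokens that begin with a communal or an agentic stem."""
    tokens = re.findall(r"[a-z]+", letter.lower())
    communal = sum(tok.startswith(COMMUNAL_STEMS) for tok in tokens)
    agentic = sum(tok.startswith(AGENTIC_STEMS) for tok in tokens)
    return {"words": len(tokens), "communal": communal, "agentic": agentic}


def gendered_rate(letter: str) -> float:
    """Gendered words per 100 words, so letters of different length compare fairly."""
    counts = gendered_word_counts(letter)
    if counts["words"] == 0:
        return 0.0
    return 100.0 * (counts["communal"] + counts["agentic"]) / counts["words"]


def mean_rate(letters: list) -> float:
    """Average gendered-word rate over a non-empty group of letters."""
    return mean(gendered_rate(letter) for letter in letters)
```

Per-group rates computed this way could then be compared with the tests listed above. Normalizing by word count matters here because letter length changed over the study period.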
A total of 611 LoRs were analyzed for 171 applicants (16.4% female), and 95.3% of letter authors were male. Scaled USMLE scores and clerkship grades (SCG) were similar for both genders (p > 0.05 for both). Average word count for all letters was 290 words and was not significantly different between genders (p = 0.18). LoRs written before 2000 were significantly shorter than those written after, among applicants of both genders (female p = 0.004; male p < 0.001). Gender bias analysis of female LoRs revealed more gendered wording compared to male LoRs (p = 0.04) and was most prominent among females with lower SCG (9.5 vs 5.1, p = 0.01). Sentiment analysis revealed male LoRs with female authors had significantly more positive sentiment compared to female LoRs (p = 0.02), and males with higher SCG had more positive sentiment compared to those with lower SCG (9.4 vs 8.2, p = 0.03). NLP detected more "fear" in male LoRs with lower SCGs (0.11 vs 0.09, p = 0.02). Female LoRs with higher SCGs had more positive sentiment (0.78 vs 0.83, p = 0.03) and "joy" (0.60 vs 0.63, p = 0.02), although those written before 2000 had less "joy" (0.5 vs 0.63, p = 0.006).
AI and computer-based algorithms detected linguistic differences and gender bias in LoRs written for general surgery residency applicants, even following stratification by clerkship grades and when analyzed by decade.