Gisselbaek Mia, Minsart Laurens, Köselerli Ekin, Suppan Mélanie, Meco Basak Ceyda, Seidel Laurence, Albert Adelin, Barreto Chang Odmara L, Saxena Sarah, Berger-Estilita Joana
Division of Anesthesiology, Department of Anesthesiology, Clinical Pharmacology, Intensive Care and Emergency Medicine, Faculty of Medicine, Geneva University Hospitals, Geneva, Switzerland.
Department of Anesthesia, Antwerp University Hospital, Edegem, Belgium.
Front Artif Intell. 2024 Oct 9;7:1462819. doi: 10.3389/frai.2024.1462819. eCollection 2024.
Artificial Intelligence (AI) is increasingly being integrated into anesthesiology to enhance patient safety, improve efficiency, and streamline various aspects of practice.
This study aims to evaluate whether AI-generated images accurately depict the racial and ethnic diversity observed in the anesthesiology workforce and to identify inherent social biases in these images.
This cross-sectional analysis was conducted from January to February 2024. Demographic data were collected from the American Society of Anesthesiologists (ASA) and the European Society of Anesthesiology and Intensive Care (ESAIC). Two AI text-to-image models, ChatGPT DALL-E 2 and Midjourney, generated images of anesthesiologists across various subspecialties. Three independent reviewers assessed and categorized each image based on sex, race/ethnicity, age, and emotional traits.
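The abstract does not specify how images were requested from the models; the study used the ChatGPT and Midjourney interfaces interactively. As a purely illustrative sketch, the batch-generation step could look like the following with the OpenAI Python client (v1.x), where the model name, prompts, subspecialty list, and image sizes are all assumptions, not the authors' protocol.

```python
# Hypothetical sketch of batch text-to-image generation per subspecialty.
# Assumes the OpenAI Python client (v1.x) and an OPENAI_API_KEY in the
# environment; the study itself used the ChatGPT DALL-E 2 and Midjourney
# interfaces, so this is illustrative only.
from openai import OpenAI

client = OpenAI()

# Illustrative subspecialty list; the study's actual prompts are not given.
SUBSPECIALTIES = [
    "cardiac anesthesiologist",
    "pediatric anesthesiologist",
    "obstetric anesthesiologist",
]

def generate_portraits(subspecialty: str, n: int = 4) -> list[str]:
    """Request n images for one subspecialty prompt and return their URLs."""
    response = client.images.generate(
        model="dall-e-2",
        prompt=f"A photo of a {subspecialty} at work",
        n=n,
        size="512x512",
    )
    return [image.url for image in response.data]

for specialty in SUBSPECIALTIES:
    print(specialty, generate_portraits(specialty))
```

In a workflow like this, the returned URLs would then be downloaded and passed to the independent reviewers for categorization by sex, race/ethnicity, age, and emotional traits.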
A total of 1,200 images were analyzed. We found significant discrepancies between AI-generated images and actual demographic data. The models predominantly portrayed anesthesiologists as White, with ChatGPT DALL-E 2 at 64.2% and Midjourney at 83.0%. Moreover, male gender was highly associated with White ethnicity by ChatGPT DALL-E 2 (79.1%) and with non-White ethnicity by Midjourney (87%). Age distribution also varied significantly, with younger anesthesiologists underrepresented. The analysis also revealed predominant traits such as "masculine," "attractive," and "trustworthy" across various subspecialties.
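The abstract reports significant discrepancies without naming the test used. One common approach for this kind of comparison is a chi-square goodness-of-fit test against workforce reference proportions; the sketch below is a minimal example, where the counts (600 images per model, split from the reported 64.2% White figure) and the reference shares are hypothetical, not the study's tabulated data.

```python
# Illustrative goodness-of-fit test: do generated-image race/ethnicity
# counts match workforce reference proportions? Counts and reference
# shares below are assumptions for demonstration only.
from scipy.stats import chisquare

observed = [385, 215]            # hypothetical White vs. non-White counts in 600 DALL-E 2 images (~64.2% White)
reference_share = [0.55, 0.45]   # hypothetical workforce proportions (e.g., from ASA/ESAIC membership data)

expected = [p * sum(observed) for p in reference_share]
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {stat:.2f}, p = {p_value:.4g}")
```

A small p-value here would indicate that the generated images deviate from the reference demographic distribution, which is the pattern the study reports.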
AI models exhibited notable biases in gender, race/ethnicity, and age representation, failing to reflect the actual diversity within the anesthesiologist workforce. These biases highlight the need for more diverse training datasets and strategies to mitigate bias in AI-generated images to ensure accurate and inclusive representations in the medical field.