Andrew Kevlian, Montalbano Michael J
Department of Anatomical Sciences, St. George's University, True Blue, St. George's, Grenada.
Med Sci Educ. 2025 Feb 27;35(3):1473-1488. doi: 10.1007/s40670-025-02332-9. eCollection 2025 Jun.
Medical education professionals expect artificial intelligence (AI) systems to be an efficient faculty resource for content creation. However, prior findings suggest that machine learning algorithms may exacerbate negative stereotypes and undermine efforts for diversity, equity, and inclusivity. This investigation explores the potential of OpenAI's ChatGPT (OCG) and Microsoft's Bing A.I. Image Creator (MBIC) to perpetuate ethnoracial stereotypes in medical cases.
A series of medically relevant vignettes and visual representatives were requested from ChatGPT and MBIC for five medical conditions traditionally associated with certain ethnoracial groups: sickle cell anemia, cystic fibrosis, Tay-Sachs disease, beta-thalassemia, and aldehyde dehydrogenase deficiency. Initial prompting, self-prompting, and prompt engineering were iteratively performed to ascertain the extent to which AI outputs for generated vignettes and imagery were mutable or fixed.
The ethnoracial identity in the vignettes of the clinical conditions adhered more closely than described in epidemiologic studies. Following prompt engineering and self-prompting, an increase in diversity was seen. On initial prompting, the most common ethnoracial identity depicted was Caucasian. Secondary prompting resulted in less diversity with higher conformation to the traditionally expected ethnoracial identity.
The prevalence of dataset bias and AI's user-dependent learning abilities underscore the importance of human stewardship. The increasing use of AI in generating medical education content, like MCQs, demands vigilant use of such tools to combat the reinforcement of the race-based stereotypes in medicine.
医学教育专业人员期望人工智能(AI)系统成为内容创作的高效师资资源。然而,先前的研究结果表明,机器学习算法可能会加剧负面刻板印象,并破坏多样性、公平性和包容性方面的努力。本调查探讨了OpenAI的ChatGPT(OCG)和微软的必应人工智能图像生成器(MBIC)在医疗案例中延续种族刻板印象的可能性。
针对与某些种族群体传统相关的五种医疗状况,向ChatGPT和MBIC索要了一系列与医学相关的病例 vignettes 和视觉代表:镰状细胞贫血、囊性纤维化、泰-萨克斯病、β-地中海贫血和醛脱氢酶缺乏症。反复进行初始提示、自我提示和提示工程,以确定生成的 vignettes 和图像的人工智能输出在多大程度上是可变的或固定的。
临床状况 vignettes 中的种族身份比流行病学研究中描述的更为紧密地遵循传统认知。经过提示工程和自我提示后,多样性有所增加。在初始提示时,描绘的最常见种族身份是白种人。二次提示导致多样性减少,与传统预期的种族身份一致性更高。
数据集偏差的普遍性和人工智能依赖用户的学习能力凸显了人类管理的重要性。在生成医学教育内容(如多项选择题)中越来越多地使用人工智能,要求谨慎使用此类工具,以对抗医学中基于种族的刻板印象的强化。