Medical Genetics Branch, National Human Genome Research Institute, Bethesda, Maryland.
JAMA Netw Open. 2024 Mar 4;7(3):e242609. doi: 10.1001/jamanetworkopen.2024.2609.
IMPORTANCE: The lack of standardized genetics training in pediatrics residencies, along with a shortage of medical geneticists, necessitates innovative educational approaches.
OBJECTIVE: To compare pediatric resident recognition of Kabuki syndrome (KS) and Noonan syndrome (NS) after 1 of 4 educational interventions, including generative artificial intelligence (AI) methods.
DESIGN, SETTING, AND PARTICIPANTS: This comparative effectiveness study used generative AI to create images of children with KS and NS. From October 1, 2022, to February 28, 2023, US pediatric residents were provided images through a web-based survey to assess whether these images helped them recognize genetic conditions.
EXPOSURES: Participants categorized 20 images after exposure to 1 of 4 educational interventions (text-only descriptions, real images, and 2 types of images created by generative AI).
MAIN OUTCOMES AND MEASURES: Associations of educational interventions with accuracy and self-reported confidence.
RESULTS: Of 2515 contacted pediatric residents, 106 and 102 completed the KS and NS surveys, respectively. For KS, the sensitivity of text description was 48.5% (128 of 264), which was not significantly different from random guessing (odds ratio [OR], 0.94; 95% CI, 0.69-1.29; P = .71). Sensitivity was thus compared for real images vs random guessing (60.3% [188 of 312]; OR, 1.52; 95% CI, 1.15-2.00; P = .003) and 2 types of generative AI images vs random guessing (57.0% [212 of 372]; OR, 1.32; 95% CI, 1.04-1.69; P = .02 and 59.6% [193 of 324]; OR, 1.47; 95% CI, 1.12-1.94; P = .006) (denominators differ according to survey responses). The sensitivity of the NS text-only description was 65.3% (196 of 300). Compared with text-only, the sensitivity of the real images was 74.3% (205 of 276; OR, 1.53; 95% CI, 1.08-2.18; P = .02), and the sensitivity of the 2 types of images created by generative AI was 68.0% (204 of 300; OR, 1.13; 95% CI, 0.77-1.66; P = .54) and 71.0% (247 of 328; OR, 1.30; 95% CI, 0.92-1.83; P = .14). For specificity, no intervention was statistically different from text only. After the interventions, the number of participants who reported being unsure about important diagnostic facial features decreased from 56 (52.8%) to 5 (7.6%) for KS (P < .001) and 25 (24.5%) to 4 (4.7%) for NS (P < .001). There was a significant association between confidence level and sensitivity for real and generated images.
CONCLUSIONS AND RELEVANCE: In this study, real and generated images helped participants recognize KS and NS; real images appeared most helpful. Generated images were noninferior to real images and could serve an adjunctive role, particularly for rare conditions.
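The sensitivities and several of the odds ratios reported above can be checked with simple arithmetic on the stated counts. The following is a minimal sketch, not the authors' analysis: the abstract does not describe the statistical model, the function names are hypothetical, and it assumes that "random guessing" means a 50% chance of a correct answer (odds of 1). Under those assumptions, the crude odds reproduce the reported point estimates for the counts used here; confidence intervals and P values are not recomputed.

```python
def sensitivity_pct(correct: int, total: int) -> float:
    """Percentage of affected-image responses that were correct."""
    return 100.0 * correct / total


def odds(correct: int, total: int) -> float:
    """Odds of a correct response: correct / incorrect."""
    return correct / (total - correct)


def odds_ratio(correct_a: int, total_a: int, correct_b: int, total_b: int) -> float:
    """Crude (unadjusted) odds ratio of group A relative to group B."""
    return odds(correct_a, total_a) / odds(correct_b, total_b)


# KS counts (correct, total) as stated in the Results.
ks = {
    "text only": (128, 264),
    "real images": (188, 312),
    "generative AI, type 1": (212, 372),
    "generative AI, type 2": (193, 324),
}

# Assumption: random guessing is 50% correct, so its odds are 1 and the
# odds ratio vs chance equals the odds of a correct response.
for name, (correct, total) in ks.items():
    print(f"KS {name}: sensitivity {sensitivity_pct(correct, total):.1f}%, "
          f"OR vs chance {odds(correct, total):.2f}")

# NS comparisons against the text-only arm (196 of 300).
ns_text = (196, 300)
ns_real = (205, 276)
ns_gen_1 = (204, 300)
print(f"NS real vs text: OR {odds_ratio(*ns_real, *ns_text):.2f}")
print(f"NS generative AI type 1 vs text: OR {odds_ratio(*ns_gen_1, *ns_text):.2f}")
```

Because these are unadjusted calculations on pooled responses, any agreement with the published estimates should be read as a consistency check rather than a reconstruction of the study's method.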