Obstetrical Unit, Shamir Medical Center (formerly Assaf Harofeh Medical Center), Zerifin, Israel.
Affiliated to the Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
Fetal Diagn Ther. 2024;51(5):474-477. doi: 10.1159/000539658. Epub 2024 Jun 4.
OpenAI's GPT-4 (artificial intelligence [AI]) is being studied for its use as a medical decision support tool. This research examines its accuracy in refining referrals for fetal echocardiography (FE) to improve early detection and outcomes related to congenital heart defects (CHDs).
Data from past FE referrals to our institution were evaluated separately by a pediatric cardiologist, a gynecologist (human experts [experts]), and AI, according to established guidelines. We compared agreement between the experts and AI on the necessity of referral, and the experts reviewed the discrepancies.
A total of 59 FE cases were reviewed retrospectively. The cardiologist, the gynecologist, and AI recommended performing FE in 47.5%, 49.2%, and 59.0% of cases, respectively. AI's recommendations agreed with those of each expert in approximately 80.0% of cases (p < 0.001). Notably, AI recommended FE more often than the experts for minor CHD (64.7% vs. 47.1%), whereas for major CHD the experts recommended FE in all cases (100%) and AI in most cases (90.9%). Discrepancies between AI and the experts are detailed and reviewed.
The evaluation found moderate agreement between AI and the experts. AI is limited by contextual misunderstandings and a lack of specialized medical knowledge, so its use needs to be guided by clinical guidelines. Despite these shortcomings, AI recommended referral in 65% of minor CHD cases versus 47% for the experts, suggesting its potential as a cautious decision aid for clinicians.
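As an illustration of how agreement between two raters on a yes/no referral decision might be quantified, the following is a minimal sketch. The abstract does not state which agreement statistic was used, so the choice of raw percent agreement plus Cohen's kappa is an assumption, and the ten decisions below are invented for illustration only; they are not the study data.

```python
from typing import Sequence


def percent_agreement(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of cases in which both raters made the same decision."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("raters must score the same non-empty set of cases")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters and a binary decision.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's own
    rate of recommending referral.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    p_a = sum(a) / n  # rate at which rater A recommends referral
    p_b = sum(b) / n  # rate at which rater B recommends referral
    p_e = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_e == 1.0:
        # Both raters gave the same constant answer; kappa is undefined,
        # and this sketch reports perfect agreement by convention.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical decisions (True = recommend FE), NOT taken from the study.
expert = [True, True, True, True, True, False, False, False, False, False]
ai = [True, True, True, True, True, True, True, False, False, False]

print(f"percent agreement: {percent_agreement(expert, ai):.2f}")  # 0.80
print(f"Cohen's kappa:     {cohen_kappa(expert, ai):.2f}")        # 0.60
```

In this made-up example the two raters agree in 80% of cases, yet kappa is lower because part of that agreement would be expected by chance alone, which is why a high raw agreement can still be described as only moderate.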