Lim Bryan, Cevik Jevan, Seth Ishith, Sofiadellis Foti, Ross Richard J, Rozen Warren M, Cuomo Roberto
Department of Plastic Surgery, Peninsula Health, Melbourne, Victoria, 3199, Australia.
Faculty of Medicine, Monash University, Melbourne, Victoria, 3004, Australia.
JPRAS Open. 2024 Apr 5;40:273-285. doi: 10.1016/j.jpra.2024.03.010. eCollection 2024 Jun.
Artificial intelligence (AI) has the potential to transform preoperative planning for breast reconstruction by enhancing the efficiency, accuracy, and reliability of radiology reporting through automatic interpretation and perforator identification. Large language models (LLMs) have recently advanced significantly in medicine. This study aimed to evaluate the proficiency of contemporary LLMs in interpreting computed tomography angiography (CTA) scans for deep inferior epigastric perforator (DIEP) flap preoperative planning.
Four prominent LLMs (ChatGPT-4, BARD, Perplexity, and BingAI) answered six questions on CTA scan reporting. A panel of expert plastic surgeons with extensive experience in breast reconstruction assessed the responses using a Likert scale. The responses' readability was evaluated using the Flesch Reading Ease score, the Flesch-Kincaid Grade Level, and the Coleman-Liau Index. The DISCERN score was used to determine the responses' suitability. Statistical significance was assessed with a t-test, and P-values < 0.05 were considered significant.
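For readers unfamiliar with the three readability metrics, the sketch below computes them from raw text counts using their standard published formulas. The abstract does not say which tool the authors used, so the function names and the example counts here are illustrative assumptions, not the study's method or data.

```python
# Minimal sketch of the three readability formulas named in the abstract.
# Inputs are raw counts; how words, sentences, syllables, and letters are
# counted (tokenization, syllable heuristics) is left to the caller and is
# an assumption outside this sketch.

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher scores indicate easier text (roughly 0-100 for ordinary prose)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade level needed to understand the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def coleman_liau_index(words: int, sentences: int, letters: int) -> float:
    """Grade-level estimate based on letters and sentences per 100 words."""
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


if __name__ == "__main__":
    # Hypothetical counts for a short passage, chosen only for illustration.
    words, sentences, syllables, letters = 100, 5, 150, 500
    print(round(flesch_reading_ease(words, sentences, syllables), 3))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
    print(round(coleman_liau_index(words, sentences, letters), 2))
```

Note the opposite directions of the scales: a lower Flesch Reading Ease score and a higher Flesch-Kincaid Grade Level or Coleman-Liau Index both indicate text that is harder to read.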
BingAI provided the most accurate and useful responses to prompts, followed by Perplexity, ChatGPT, and then BARD. BingAI had the highest Flesch Reading Ease (34.7±5.5) and DISCERN (60.5±3.9) scores. Perplexity had higher Flesch-Kincaid Grade Level (20.5±2.7) and Coleman-Liau Index (17.8±1.6) scores than the other LLMs.
LLMs are currently limited in their ability to report CTA for preoperative planning of breast reconstruction, yet rapid technological advances hint at a promising future. AI stands poised to enhance education in CTA reporting and aid preoperative planning. In the future, AI technology could provide automatic CTA interpretation, enhancing the efficiency, accuracy, and reliability of CTA reports.