Ke Yu He, Jin Liyuan, Elangovan Kabilan, Abdullah Hairil Rizal, Liu Nan, Sia Alex Tiong Heng, Soh Chai Rick, Tung Joshua Yi Min, Ong Jasmine Chiat Ling, Kuo Chang-Fu, Wu Shao-Chun, Kovacheva Vesela P, Ting Daniel Shu Wei
Department of Anesthesiology, Singapore General Hospital, Singapore, Singapore.
Data Science and Artificial Intelligence Lab, Singapore General Hospital, Singapore, Singapore.
NPJ Digit Med. 2025 Apr 5;8(1):187. doi: 10.1038/s41746-025-01519-z.
Large Language Models (LLMs) hold promise for medical applications but often lack domain-specific expertise. Retrieval Augmented Generation (RAG) enables customization by integrating specialized knowledge. This study assessed the accuracy, consistency, and safety of LLM-RAG models in determining surgical fitness and delivering preoperative instructions using 35 local and 23 international guidelines. Ten LLMs (e.g., GPT3.5, GPT4, GPT4o, Gemini, Llama2, Llama3, and Claude) were tested across 14 clinical scenarios. A total of 3234 responses were generated and compared to 448 human-generated answers. The GPT4 LLM-RAG model with international guidelines generated answers within 20 s and achieved the highest accuracy, which was significantly better than that of human-generated responses (96.4% vs. 86.6%, p = 0.016). Additionally, the model produced no hallucinations, and its output was more consistent than that of humans. This study underscores the potential of GPT4-based LLM-RAG models to deliver highly accurate, efficient, and consistent preoperative assessments.