Wang Sheng, Zhao Zihao, Ouyang Xi, Liu Tianming, Wang Qian, Shen Dinggang
School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China.
School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China.
Commun Eng. 2024 Sep 17;3(1):133. doi: 10.1038/s44172-024-00271-8.
Computer-aided diagnosis (CAD) has advanced medical image analysis, while large language models (LLMs) have shown potential in clinical applications. However, LLMs struggle to interpret medical images, which are critical for decision-making. Here we show a strategy integrating LLMs with CAD networks. The framework uses LLMs' medical knowledge and reasoning to enhance the outputs of CAD networks for tasks such as diagnosis, lesion segmentation, and report generation, by summarizing their information in natural language. The generated reports are of higher quality and can improve the performance of vision-based CAD models. On chest X-rays, the strategy with ChatGPT as the LLM improved diagnosis performance by 16.42 percentage points compared to state-of-the-art models, while with GPT-3 it provided a 15.00 percentage point F1-score improvement. Our strategy allows accurate report generation and creates a patient-friendly interactive system, unlike conventional CAD systems, whose outputs are understood only by professionals. This approach has the potential to revolutionize clinical decision-making and patient communication.
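The idea of summarizing CAD network outputs in natural language for an LLM can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the probability thresholds, the verbal grades, and the prompt wording are all assumptions made for illustration, and no LLM is actually called.

```python
from typing import Dict


def describe_probability(p: float) -> str:
    """Map a classifier score to a coarse verbal grade.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if p < 0.2:
        return "unlikely"
    if p < 0.5:
        return "possible"
    if p < 0.9:
        return "likely"
    return "very likely"


def build_prompt(scores: Dict[str, float], draft_report: str) -> str:
    """Turn hypothetical CAD outputs into a text prompt for an LLM.

    `scores` holds per-finding probabilities from a (hypothetical) diagnosis
    network; `draft_report` is text from a (hypothetical) report generator.
    """
    # One line per finding, sorted by name so the prompt is deterministic.
    findings = [
        f"- {name}: {describe_probability(p)} (score {p:.2f})"
        for name, p in sorted(scores.items())
    ]
    return "\n".join(
        [
            "Findings from the diagnosis network:",
            *findings,
            "Draft report from the report generation network:",
            draft_report,
            "Revise the draft so that it is consistent with the findings.",
        ]
    )


if __name__ == "__main__":
    example_scores = {"Cardiomegaly": 0.93, "Edema": 0.62, "Pneumonia": 0.05}
    print(build_prompt(example_scores, "Heart size is enlarged."))
```

The resulting string would be sent to whichever LLM is in use. Converting numeric scores to verbal grades is one possible design choice; passing the raw numbers alone is another.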