IEEE Trans Med Imaging. 2024 Nov;43(11):3755-3766. doi: 10.1109/TMI.2024.3398350. Epub 2024 Nov 4.
The integration of Computer-Aided Diagnosis (CAD) with Large Language Models (LLMs) presents a promising frontier in clinical applications, notably in automating diagnostic processes akin to those performed by radiologists and in providing consultations similar to those of a virtual family doctor. Despite this potential, current works face at least two limitations: (1) From the perspective of a radiologist, existing studies typically cover a restricted scope of imaging domains, failing to meet the diagnostic needs of different patients. In addition, the insufficient diagnostic capability of LLMs further undermines the quality and reliability of the generated medical reports. (2) Current LLMs lack the requisite depth of medical expertise, which makes them less effective as virtual family doctors because the advice provided during patient consultations may be unreliable. To address these limitations, we introduce ChatCAD+, designed to be universal and reliable. Specifically, it features two main modules: (1) Reliable Report Generation and (2) Reliable Interaction. The Reliable Report Generation module is capable of interpreting medical images from diverse domains and generating high-quality medical reports via our proposed hierarchical in-context learning. Concurrently, the Reliable Interaction module leverages up-to-date information from reputable medical websites to provide reliable medical advice. Together, these modules synergize to closely align with the expertise of human medical professionals, offering enhanced consistency and reliability in interpretation and advice. The source code is available on GitHub.