Qin Yuanyuan, Chang Jianming, Li Li, Wu Mianhua
First Clinical Medical College, Nanjing University of Chinese Medicine, Nanjing, China.
Jiangsu Collaborative Innovation Center of Traditional Chinese Medicine Prevention and Treatment of Tumor, Nanjing University of Chinese Medicine, Nanjing, China.
Front Med (Lausanne). 2025 May 21;12:1583514. doi: 10.3389/fmed.2025.1583514. eCollection 2025.
Advancements in artificial intelligence (AI) and large language models (LLMs) have the potential to revolutionize digestive endoscopy by enhancing diagnostic accuracy, improving procedural efficiency, and supporting clinical decision-making. Traditional AI-assisted endoscopic systems often rely on single-modal image analysis, which lacks contextual understanding and adaptability to complex gastrointestinal (GI) conditions. Moreover, existing methods struggle with domain shifts, data heterogeneity, and interpretability, limiting their clinical applicability.
To address these challenges, we propose a multimodal learning framework that integrates LLM-powered chatbots with endoscopic imaging and patient-specific medical data. Our approach employs self-supervised learning to extract clinically relevant patterns from heterogeneous sources, enabling real-time guidance and AI-assisted report generation. We introduce a domain-adaptive learning strategy to enhance model generalization across diverse patient populations and imaging conditions.
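The paper does not specify its architecture beyond this description, but the late-fusion idea it outlines can be sketched minimally. In the toy sketch below, an endoscopic-image feature vector and an LLM text embedding (e.g. from a clinical note) are each projected into a shared space, concatenated, and passed to a lesion-classification head. All names, dimensions, and the random "weights" are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical late-fusion sketch: project each modality into a
# shared space, concatenate, then score lesion classes.
rng = np.random.default_rng(0)

IMG_DIM, TXT_DIM, SHARED_DIM, N_CLASSES = 512, 768, 256, 3

# Random projections stand in for learned parameters.
W_img = rng.standard_normal((IMG_DIM, SHARED_DIM)) * 0.02
W_txt = rng.standard_normal((TXT_DIM, SHARED_DIM)) * 0.02
W_cls = rng.standard_normal((2 * SHARED_DIM, N_CLASSES)) * 0.02

def fuse_and_classify(img_feat: np.ndarray, txt_emb: np.ndarray) -> np.ndarray:
    """Project each modality, concatenate, and return class probabilities."""
    z_img = np.tanh(img_feat @ W_img)    # endoscopic-image branch
    z_txt = np.tanh(txt_emb @ W_txt)     # text-embedding branch
    fused = np.concatenate([z_img, z_txt], axis=-1)
    logits = fused @ W_cls
    # Softmax over hypothetical lesion classes.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

probs = fuse_and_classify(rng.standard_normal(IMG_DIM),
                          rng.standard_normal(TXT_DIM))
```

In a real system the projections would be trained end to end (the paper additionally mentions self-supervised pretraining and domain-adaptive learning, neither of which is modeled here); the sketch only shows how heterogeneous feature vectors can be combined before a shared decision head.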
Experimental results on multiple GI datasets demonstrate that our method significantly improves lesion detection, reduces diagnostic variability, and enhances physician-AI collaboration. This study highlights the potential of multimodal LLM-based systems in advancing gastroenterology by providing interpretable, context-aware, and adaptable AI support in digestive endoscopy.