Leveraging large language and vision models for knowledge extraction from large-scale image-text colonoscopy records.

Authors

Wang Shuo, Zhu Yan, Yang Zhiwei, Luo Xiaoyuan, Zhang Yizhe, Fu Peiyao, Wang Haoran, Wang Manning, Song Zhijian, Li Quanlin, Zhou Pinghong, Guo Yike

Affiliations

Digital Medical Research Centre, School of Basic Medical Sciences, Fudan University, Shanghai, China.

Shanghai Key Laboratory of MICCAI, Shanghai, China.

Publication information

Nat Biomed Eng. 2025 Sep 16. doi: 10.1038/s41551-025-01500-x.

Abstract

The development of artificial intelligence systems for colonoscopy analysis often necessitates expert-annotated image datasets. However, limitations in dataset size and diversity impede model performance and generalization. Image-text colonoscopy records from routine clinical practice, comprising millions of images and text reports, serve as a valuable data source, although annotating them is labour intensive. Here we leverage recent advancements in large language and vision models and propose EndoKED, a data mining paradigm for deep knowledge extraction and distillation. EndoKED automates the transformation of raw colonoscopy records into image datasets with pixel-level annotation. We apply EndoKED to multicentre datasets of raw colonoscopy records (~1 million images), showing its superior performance in detecting polyps at the report and image levels, as well as annotating polyps at the pixel level. The state-of-the-art performance and generalization ability of polyp segmentation models are achieved through EndoKED pretraining. Furthermore, the EndoKED vision backbone enables data-efficient learning for optical biopsy, achieving expert-level performance in internal, external and prospective validation datasets.
