
Multimodal representations of biomedical knowledge from limited training whole slide images and reports using deep learning.

Affiliations

Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland.

Department of Information Engineering, University of Padua, Padua, Italy.

Publication information

Med Image Anal. 2024 Oct;97:103303. doi: 10.1016/j.media.2024.103303. Epub 2024 Aug 14.

Abstract

The increasing availability of biomedical data creates valuable resources for developing new deep learning algorithms to support experts, especially in domains where collecting large volumes of annotated data is not trivial. Biomedical data span several modalities containing complementary information, such as medical images and reports: images are often large and encode low-level information, while reports provide a summarized, high-level description of the findings, often concerning only a small part of the image. However, only a few methods effectively link the visual content of images with the textual content of reports, preventing medical specialists from fully benefiting from the recent opportunities offered by deep learning models. This paper introduces a multimodal architecture that creates a robust biomedical data representation by encoding fine-grained text representations within image embeddings. The architecture aims to tackle data scarcity (combining supervised and self-supervised learning) and to create multimodal biomedical ontologies. It is trained on over 6,000 colon whole slide images (WSIs), each paired with the corresponding report, collected from two digital pathology workflows. The evaluation of the multimodal architecture involves three tasks: WSI classification (on data from the pathology workflows and from public repositories), multimodal data retrieval, and linking between textual and visual concepts. Notably, the latter two tasks are available by architectural design without further training, showing that the multimodal architecture can be adopted as a backbone to solve specialized tasks. The multimodal data representation outperforms the unimodal one on the classification of colon WSIs and halves the data needed to reach accurate performance, reducing the computational power required and thus the carbon footprint.
Combining images and reports through self-supervised algorithms makes it possible to mine databases and extract new information without new annotations from experts. In particular, the multimodal visual ontology, linking semantic concepts to images, may pave the way to advancements in medicine and biomedical analysis, not limited to histopathology.
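The multimodal retrieval task the abstract describes relies on images and text sharing one embedding space, so a report query can rank WSI embeddings directly by similarity. A minimal illustrative sketch of that retrieval step, not the authors' architecture: it assumes embeddings have already been produced by some encoder, and the embedding names and dimensions below are hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Scale each embedding to unit length so dot products equal cosine similarity.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def retrieve(query_emb, gallery_embs, top_k=3):
    # Rank gallery items (e.g., WSI embeddings) by cosine similarity
    # to a text-query embedding; return indices of the top_k matches.
    q = l2_normalize(query_emb)
    g = l2_normalize(gallery_embs)
    scores = g @ q
    return np.argsort(-scores)[:top_k]

# Toy example with hypothetical 4-dimensional embeddings from a shared space.
rng = np.random.default_rng(0)
image_embs = rng.normal(size=(5, 4))                     # five "WSI" embeddings
text_query = image_embs[2] + 0.01 * rng.normal(size=4)   # a report near image 2
top = retrieve(text_query, image_embs, top_k=1)
print(top)  # index of the closest image: 2
```

Because retrieval is a similarity lookup over precomputed embeddings, it needs no extra training, which mirrors the paper's point that the latter two tasks come for free from the architectural design.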

