Combining Real and Synthetic Data to Overcome Limited Training Datasets in Multimodal Learning.

Authors

Marini Niccolo, Liang Zhaohui, Rajaraman Sivaramakrishnan, Xue Zhiyun, Antani Sameer

Affiliation

Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Publication

medRxiv. 2025 Jul 17:2025.07.16.25331662. doi: 10.1101/2025.07.16.25331662.

DOI: 10.1101/2025.07.16.25331662
PMID: 40791679
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12338939/
Abstract

Biomedical data are inherently multimodal, capturing complementary aspects of a patient's condition. Deep learning (DL) algorithms that integrate multiple biomedical modalities can significantly improve clinical decision-making, especially in domains where collecting data is not simple and data are highly heterogeneous. However, developing effective and reliable multimodal DL methods remains challenging, requiring large training datasets with paired samples from modalities of interest. An increasing number of de-identified biomedical datasets are publicly accessible, though they still tend to be unimodal. For example, several publicly available skin lesion datasets aid automated dermatology clinical decision-making. Still, they lack annotated reports paired with the images, thereby limiting the advance and use of multimodal DL algorithms. This work presents a strategy exploiting real and synthesized data in a multimodal architecture that encodes fine-grained text representations within image embeddings to create a robust representation of skin lesion data. Large language models (LLMs) are used to synthesize textual descriptions from image metadata that are subsequently paired with the original skin lesion images and used for model development. The architecture is evaluated on the classification of skin lesion images, considering nine internal and external data sources. The proposed multimodal representation outperforms the unimodal one on the classification of skin lesion images, achieving superior performance in every tested dataset.
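
The data-pairing step described in the abstract (synthesizing a textual description from image metadata and pairing it with the original image) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract says an LLM generates the descriptions, whereas the sketch below swaps in a fixed string template so that it runs without any model. The function names `describe` and `build_pairs`, the metadata keys, and the file names are all hypothetical.

```python
from typing import Any, Dict, List, Tuple


def describe(metadata: Dict[str, Any]) -> str:
    """Render image metadata as a short text description.

    Template stand-in for the LLM step named in the abstract (an assumption
    made for illustration, not the paper's method). Empty fields are skipped.
    """
    fields = [
        f"{key.replace('_', ' ')}: {value}"
        for key, value in metadata.items()
        if value not in (None, "")
    ]
    if not fields:
        return "Skin lesion image with no recorded metadata."
    return "Skin lesion image. " + "; ".join(fields) + "."


def build_pairs(records: List[Tuple[str, Dict[str, Any]]]) -> List[Tuple[str, str]]:
    """Pair each original image path with its synthesized description."""
    return [(image_path, describe(metadata)) for image_path, metadata in records]


# Hypothetical records: (image path, metadata) with made-up keys and values.
records = [
    ("img_001.jpg", {"age": 54, "sex": "male", "anatomical_site": "back", "notes": None}),
    ("img_002.jpg", {}),
]
pairs = build_pairs(records)
```

Each resulting (image, text) pair could then be fed to a multimodal model; how the paper's architecture fuses the two modalities is not specified in the abstract and is not sketched here.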

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3c8/12338939/ac4e90ed96aa/nihpp-2025.07.16.25331662v1-f0001.jpg
