
Attention-based multimodal fusion with contrast for robust clinical prediction in the face of missing modalities.

Affiliations

Australian e-Health Research Centre, CSIRO, Queensland, Australia; School of Computing and Information Systems, The University of Melbourne, Victoria, Australia.

School of Computing and Information Systems, The University of Melbourne, Victoria, Australia; Centre for Digital Transformation of Health, The University of Melbourne, Victoria, Australia.

Publication information

J Biomed Inform. 2023 Sep;145:104466. doi: 10.1016/j.jbi.2023.104466. Epub 2023 Aug 5.

DOI: 10.1016/j.jbi.2023.104466
PMID: 37549722
Abstract

OBJECTIVE

With the increasing amount and growing variety of healthcare data, multimodal machine learning supporting integrated modeling of structured and unstructured data is an increasingly important tool for clinical machine learning tasks. However, it is non-trivial to manage the differences in dimensionality, volume, and temporal characteristics of data modalities in the context of a shared target task. Furthermore, patients can have substantial variations in the availability of data, while existing multimodal modeling methods typically assume data completeness and lack a mechanism to handle missing modalities.

METHODS

We propose a Transformer-based fusion model with modality-specific tokens that summarize the corresponding modalities to achieve effective cross-modal interaction accommodating missing modalities in the clinical context. The model is further refined by inter-modal, inter-sample contrastive learning to improve the representations for better predictive performance. We denote the model as Attention-based cRoss-MOdal fUsion with contRast (ARMOUR). We evaluate ARMOUR using two input modalities (structured measurements and unstructured text), six clinical prediction tasks, and two evaluation regimes, either including or excluding samples with missing modalities.
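The core idea of modality-specific summary tokens can be illustrated with a minimal sketch in plain Python. This is not the ARMOUR implementation (which uses Transformer cross-attention with learned parameters and contrastive refinement); the `fuse` function, the `query` vector, and the toy modality vectors are all hypothetical. It shows only the masking mechanism: attention is pooled over whichever modality summaries are present, so a patient with a missing modality is handled by the same code path.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fuse(modality_tokens, query):
    """Attention-pool over whichever modality summary tokens are present.

    modality_tokens: dict mapping modality name -> summary vector, or None
                     when that modality is missing for the patient.
    query: a query vector (learned in a real model; fixed here).
    Missing modalities are simply excluded from the attention, which is
    what lets one model accept incomplete inputs.
    """
    present = {m: t for m, t in modality_tokens.items() if t is not None}
    names = list(present)
    scores = [dot(query, present[m]) for m in names]
    weights = softmax(scores)
    dim = len(query)
    fused = [sum(w * present[m][i] for w, m in zip(weights, names))
             for i in range(dim)]
    return fused, dict(zip(names, weights))

# Patient with both modalities present: attention splits across them.
both, w_both = fuse({"measurements": [1.0, 0.0], "text": [0.0, 1.0]},
                    query=[1.0, 1.0])
# Patient missing the text modality: attention collapses onto measurements.
only_meas, w_meas = fuse({"measurements": [1.0, 0.0], "text": None},
                         query=[1.0, 1.0])
print(w_meas)  # {'measurements': 1.0}
```

In the toy example above, the two-modality patient gets equal attention weights (the scores tie), while the patient with missing text is fused from measurements alone; no imputation or separate model is needed.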

RESULTS

Our model shows improved performance over unimodal and multimodal baselines in both evaluation regimes, whether including or excluding patients with missing modalities in the input. Contrastive learning improves the representation power and is shown to be essential for better results. The simple setup of modality-specific tokens enables ARMOUR to handle patients with missing modalities and allows comparison with existing unimodal benchmark results.

CONCLUSION

We propose a multimodal model for robust clinical prediction to achieve improved performance while accommodating patients with missing modalities. This work could inspire future research to study the effective incorporation of multiple, more complex modalities of clinical data into a single model.


Similar articles

1. Attention-based multimodal fusion with contrast for robust clinical prediction in the face of missing modalities.
J Biomed Inform. 2023 Sep;145:104466. doi: 10.1016/j.jbi.2023.104466. Epub 2023 Aug 5.

2. Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition.
IEEE Trans Neural Syst Rehabil Eng. 2023;31:4661-4671. doi: 10.1109/TNSRE.2023.3335101. Epub 2023 Nov 30.

3. COM: Contrastive Masked-attention model for incomplete multimodal learning.
Neural Netw. 2023 May;162:443-455. doi: 10.1016/j.neunet.2023.03.003. Epub 2023 Mar 5.

4. Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks.
Comput Intell Neurosci. 2022 Aug 9;2022:4767437. doi: 10.1155/2022/4767437. eCollection 2022.

5. A unified multimodal classification framework based on deep metric learning.
Neural Netw. 2025 Jan;181:106747. doi: 10.1016/j.neunet.2024.106747. Epub 2024 Oct 4.

6. Joint learning-based feature reconstruction and enhanced network for incomplete multi-modal brain tumor segmentation.
Comput Biol Med. 2023 Sep;163:107234. doi: 10.1016/j.compbiomed.2023.107234. Epub 2023 Jul 4.

7. Multimodal learning for fetal distress diagnosis using a multimodal medical information fusion framework.
Front Physiol. 2022 Nov 7;13:1021400. doi: 10.3389/fphys.2022.1021400. eCollection 2022.

8. Progressive Learning of a Multimodal Classifier Accounting for Different Modality Combinations.
Sensors (Basel). 2023 May 11;23(10):4666. doi: 10.3390/s23104666.

9. Multimodal MRI radiomic models to predict genomic mutations in diffuse intrinsic pontine glioma with missing imaging modalities.
Front Med (Lausanne). 2023 Feb 23;10:1071447. doi: 10.3389/fmed.2023.1071447. eCollection 2023.

10. Toward attention-based learning to predict the risk of brain degeneration with multimodal medical data.
Front Neurosci. 2023 Jan 18;16:1043626. doi: 10.3389/fnins.2022.1043626. eCollection 2022.

Cited by

1. AI-driven multimodal colorimetric analytics for biomedical and behavioral health diagnostics.
Comput Struct Biotechnol J. 2025 May 28;27:2219-2232. doi: 10.1016/j.csbj.2025.05.015. eCollection 2025.

2. Navigating the Multiverse: a Hitchhiker's guide to selecting harmonization methods for multimodal biomedical data.
Biol Methods Protoc. 2025 Apr 17;10(1):bpaf028. doi: 10.1093/biomethods/bpaf028. eCollection 2025.

3. Distilling the knowledge from large-language model for health event prediction.
Sci Rep. 2024 Dec 28;14(1):30675. doi: 10.1038/s41598-024-75331-2.

4. Integrating artificial intelligence with smartphone-based imaging for cancer detection in vivo.
Biosens Bioelectron. 2025 Mar 1;271:116982. doi: 10.1016/j.bios.2024.116982. Epub 2024 Nov 21.

5. Clinical natural language processing for secondary uses.
J Biomed Inform. 2024 Feb;150:104596. doi: 10.1016/j.jbi.2024.104596. Epub 2024 Jan 24.