ILDIM-MFAM: interstitial lung disease identification model with multi-modal fusion attention mechanism.

Authors

Zhong Bin, Zhang Runan, Luo Shuixiang, Zheng Jie

Affiliations

Department of Respiratory Medicine, The First Affiliated Hospital of Gannan Medical University, Ganzhou, Jiangxi, China.

College of Pharmacy, Gannan Medical University, Ganzhou, Jiangxi, China.

Publication information

Front Med (Lausanne). 2024 Nov 18;11:1446936. doi: 10.3389/fmed.2024.1446936. eCollection 2024.

DOI:10.3389/fmed.2024.1446936
PMID:39624040
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11609931/
Abstract

This study addresses the potential and challenges of multimodal medical information in the diagnosis of interstitial lung disease (ILD) by developing an ILD identification model (ILDIM) based on a multimodal fusion attention mechanism (MFAM), with the aim of improving the accuracy and reliability of ILD diagnosis. Large-scale multimodal medical data were collected, including chest CT image slices, physiological indicator time series, and patient history text. The data were cleaned and normalized to ensure quality and consistency. A Convolutional Neural Network (CNN) extracts CT image features, a Bidirectional Long Short-Term Memory (Bi-LSTM) network learns long-term dependencies in the physiological time series, and a self-attention mechanism encodes the semantic content of patients' self-reports and medical prescriptions. A Transformer-based multimodal perception mechanism then learns an importance weight for each modality and uses these weights to fuse the modalities, improving diagnostic performance. Ablation tests and comparison experiments show that the model performs well overall: combining the multimodal data sources improved Precision, Recall, and F1 score and significantly increased the AUC. This suggests that using information from different modalities together gives a more comprehensive assessment of a patient's health status and thereby improves the comprehensiveness and accuracy of ILD diagnosis. The study also considered computational cost, and the results show that ILDIM-MFAM has a relatively low parameter count and computational complexity, which favors practical deployment and efficient operation.
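
The idea of learning an importance weight per modality and fusing on that basis can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes a Transformer-based fusion module, whereas the code below is a simplified single-query attention pooling written in plain Python. The function names, the `query` vector (which would be a learned parameter in a trained model), and the toy two-dimensional embeddings standing in for the CNN, Bi-LSTM, and self-attention encoder outputs are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(
    features: Dict[str, List[float]], query: List[float]
) -> Tuple[Dict[str, float], List[float]]:
    """Attention-pool per-modality embeddings into one fused vector.

    `features` maps a modality name to its embedding (all the same length
    as `query`). `query` is a hypothetical stand-in for a learned vector.
    Returns the per-modality importance weights and the fused embedding.
    """
    names = list(features)
    dim = len(query)
    scale = math.sqrt(dim)  # scaled dot-product, as in standard attention
    scores = [
        sum(q * x for q, x in zip(query, features[name])) / scale
        for name in names
    ]
    weights = softmax(scores)
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), fused


# Toy embeddings (assumed values, not data from the paper).
toy_features = {
    "ct_image": [1.0, 0.0],       # would come from the CNN
    "physio_series": [0.0, 1.0],  # would come from the Bi-LSTM
    "history_text": [1.0, 1.0],   # would come from the text encoder
}
toy_query = [1.0, 1.0]

modality_weights, fused_embedding = fuse_modalities(toy_features, toy_query)
for name, weight in modality_weights.items():
    print(f"{name}: {weight:.3f}")
```

In this toy setup the text embedding aligns best with the query, so it receives the largest weight, while the CT and time-series embeddings receive equal weights. A fused vector of this kind would then feed a classification head; a full model would learn the query and encoders jointly.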

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/ebc5141cc6c7/fmed-11-1446936-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/dbd152e7f5f8/fmed-11-1446936-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/e47c43fb9636/fmed-11-1446936-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/8b18ed8b4a19/fmed-11-1446936-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/dfaecb8917f5/fmed-11-1446936-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/8e747a6e23e9/fmed-11-1446936-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7411/11609931/8c7447dab738/fmed-11-1446936-g007.jpg

Similar articles

1
ILDIM-MFAM: interstitial lung disease identification model with multi-modal fusion attention mechanism.
Front Med (Lausanne). 2024 Nov 18;11:1446936. doi: 10.3389/fmed.2024.1446936. eCollection 2024.
2
MFAM-AD: an anomaly detection model for multivariate time series using attention mechanism to fuse multi-scale features.
PeerJ Comput Sci. 2024 Aug 30;10:e2201. doi: 10.7717/peerj-cs.2201. eCollection 2024.
3
MMAgentRec, a personalized multi-modal recommendation agent with large language model.
Sci Rep. 2025 Apr 8;15(1):12062. doi: 10.1038/s41598-025-96458-w.
4
Multi-modal fusion model for Time-Varying medical Data: Addressing Long-Term dependencies and memory challenges in sequence fusion.
J Biomed Inform. 2025 May;165:104823. doi: 10.1016/j.jbi.2025.104823. Epub 2025 Apr 4.
5
A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance.
BMC Med Res Methodol. 2022 Jul 2;22(1):181. doi: 10.1186/s12874-022-01665-y.
6
Towards robust multimodal ultrasound classification for liver tumor diagnosis: A generative approach to modality missingness.
Comput Methods Programs Biomed. 2025 Jun;265:108759. doi: 10.1016/j.cmpb.2025.108759. Epub 2025 Mar 30.
7
CCGL-YOLOV5: A cross-modal cross-scale global-local attention YOLOV5 lung tumor detection model.
Comput Biol Med. 2023 Oct;165:107387. doi: 10.1016/j.compbiomed.2023.107387. Epub 2023 Aug 28.
8
BAF-Net: bidirectional attention-aware fluid pyramid feature integrated multimodal fusion network for diagnosis and prognosis.
Phys Med Biol. 2024 Apr 29;69(10). doi: 10.1088/1361-6560/ad3cb2.
9
Transferable non-invasive modal fusion-transformer (NIMFT) for end-to-end hand gesture recognition.
J Neural Eng. 2024 Apr 9;21(2). doi: 10.1088/1741-2552/ad39a5.
10
Translating medical image to radiological report: Adaptive multilevel multi-attention approach.
Comput Methods Programs Biomed. 2022 Jun;221:106853. doi: 10.1016/j.cmpb.2022.106853. Epub 2022 May 4.
