

Design of an iterative hybrid multimodal deep learning method for early skin disease detection with cross-attention and graph-based fusions.

Author information

Shivasree Yerrabati, RaviSankar V

Affiliations

Department of Computer Science & Engineering, GITAM Deemed to be University, Hyderabad, Telangana, India.

Publication information

MethodsX. 2025 Aug 24;15:103584. doi: 10.1016/j.mex.2025.103584. eCollection 2025 Dec.

Abstract

This study proposes an end-to-end multimodal learning framework for early skin disease detection, incorporating spatial, temporal, and semantic information across heterogeneous patient data. The framework is composed of three key modules: (i) EfficientNet-B4, which extracts rich visual features from dermoscopic images; (ii) a BiLSTM enhanced with temporal attention to model symptom evolution from sensor-based time-series signals; and (iii) ClinicalBERT, a domain-specific transformer that generates contextual embeddings from patient clinical narratives. Modality-specific features are combined with a multi-head cross-attention mechanism to aggregate the inter-dependencies of input patterns and then fed into a Graph Attention Network (GAT) to capture inter-patient relationships according to feature affinity. This joint framework produces context-aware representations that can be used for classification. Experimental results show that the model achieves a predictive accuracy of 89.6 % and an average F1-score of 0.886, outperforming state-of-the-art CNN-based baselines. By simultaneously optimizing spatial detail, temporal dynamics, and clinical context, the proposed SkinHarmoNet model provides reliable and interpretable predictions, and its performance establishes a new state of the art for multimodal dermatologic AI in a clinical setting.

• Multimodal fusion: spatial, temporal, and semantic modalities
• Cross-attention and GAT: enhanced interaction of features
• High performance: 89.6 % accuracy, F1 = 0.886
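The abstract does not include code, so the following is only a minimal NumPy sketch of the multi-head cross-attention fusion step it describes: one modality's features (the image embedding) act as queries attending over the concatenated features of the other modalities (time-series and text). The random projection matrices, feature dimension, and toy embeddings are all illustrative assumptions, standing in for the learned parameters of the actual model.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context, num_heads=4, seed=0):
    """Multi-head cross-attention: rows of `query` attend to rows of `context`.
    Shapes: query (n_q, d), context (n_c, d); d must divide by num_heads.
    Random projections stand in for learned weight matrices."""
    n_q, d = query.shape
    d_h = d // num_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q = (query @ Wq).reshape(n_q, num_heads, d_h)
    K = (context @ Wk).reshape(-1, num_heads, d_h)
    V = (context @ Wv).reshape(-1, num_heads, d_h)
    out = np.empty_like(Q)
    for h in range(num_heads):
        # Scaled dot-product attention per head: (n_q, n_c) score matrix.
        scores = Q[:, h] @ K[:, h].T / np.sqrt(d_h)
        out[:, h] = softmax(scores) @ V[:, h]
    return out.reshape(n_q, d)  # heads re-concatenated into one vector per query

# Toy modality embeddings (hypothetical, d = 8): one image feature vector,
# five time-series step features, three clinical-text token embeddings.
img = np.ones((1, 8))
ts = np.full((5, 8), 0.5)
txt = np.full((3, 8), -0.2)
fused = cross_attention(img, np.vstack([ts, txt]))  # image queries the others
print(fused.shape)
```

In the paper's pipeline this fused representation would then become a node feature for the GAT, which weighs edges between patients by feature affinity; that graph stage is omitted here.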


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/12c6/12423411/0aa3bfb749d7/ga1.jpg
