Caruso Camillo Maria, Soda Paolo, Guarrasi Valerio
Research Unit of Artificial Intelligence and Computer Systems, Department of Engineering, Università Campus Bio-Medico di Roma, Roma, Italy.
Research Unit of Artificial Intelligence and Computer Systems, Department of Engineering, Università Campus Bio-Medico di Roma, Roma, Italy; Department of Diagnostics and Intervention, Radiation Physics, Biomedical Engineering, Umeå University, Umeå, Sweden.
Comput Biol Med. 2025 Sep;196(Pt C):110843. doi: 10.1016/j.compbiomed.2025.110843. Epub 2025 Aug 9.
In healthcare, integrating multimodal data is pivotal for developing comprehensive diagnostic and predictive models. However, handling missing data remains a significant challenge in real-world applications. We introduce MARIA (Multimodal Attention Resilient to Incomplete datA), a novel transformer-based deep learning model that addresses this challenge through an intermediate fusion strategy. Unlike conventional approaches that depend on imputation, MARIA uses a modified masked self-attention mechanism that processes only the available data without generating synthetic values. This lets it handle incomplete datasets effectively, enhancing robustness and minimizing the biases introduced by imputation methods. We evaluated MARIA against 10 state-of-the-art machine learning and deep learning models across 8 diagnostic and prognostic tasks. The results show that MARIA outperforms existing methods in both predictive performance and resilience to varying levels of data incompleteness, underscoring its potential for critical healthcare applications. To support transparency and encourage further research, the source code is openly available at https://github.com/cosbidev/MARIA.
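The idea of attending only to available data can be illustrated with a minimal sketch. This is not the MARIA implementation from the repository: it assumes single-head attention with identity query/key/value projections, one token per feature, and a boolean availability vector. The function name `masked_self_attention` is hypothetical and chosen for illustration only.

```python
import numpy as np


def masked_self_attention(x, available):
    """Single-head self-attention that ignores missing tokens (illustrative sketch).

    x         : (n_tokens, d) array of feature embeddings; rows of missing
                features may hold any placeholder, including NaN.
    available : (n_tokens,) booleans, True where the feature was observed.
    """
    x = np.asarray(x, dtype=float)
    available = np.asarray(available, dtype=bool)

    # Neutralise placeholders so NaNs cannot propagate. These zeros are never
    # attended to, so they do not act as imputed values.
    x = np.where(available[:, None], x, 0.0)

    # With nothing observed there is nothing to attend to.
    if not available.any():
        return np.zeros_like(x)

    d = x.shape[1]
    # Assumption: identity projections, i.e. Q = K = V = x.
    scores = x @ x.T / np.sqrt(d)

    # Missing keys receive -inf, which becomes exactly zero weight after softmax.
    scores = np.where(available[None, :], scores, -np.inf)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    out = weights @ x
    # Missing queries produce no output representation.
    out[~available] = 0.0
    return out
```

Under these assumptions, the output for observed tokens does not depend on whatever placeholder sits in the missing rows, and it matches the result of running attention on the observed tokens alone.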