

Effective Techniques for Multimodal Data Fusion: A Comparative Analysis.

Affiliations

Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa Street 75, 00-662 Warsaw, Poland.

WeSub, Adama Branickiego Street 17, 02-972 Warsaw, Poland.

Publication information

Sensors (Basel). 2023 Feb 21;23(5):2381. doi: 10.3390/s23052381.

Abstract

Data processing in robotics is currently challenged by the effective building of multimodal and common representations. Tremendous volumes of raw data are available, and their smart management is the core concept of multimodal learning in a new paradigm for data fusion. Although several techniques for building multimodal representations have proven successful, they have not yet been analyzed and compared in a given production setting. This paper explored three of the most common techniques, (1) late fusion, (2) early fusion, and (3) the sketch, and compared them in classification tasks. Our paper explored the different types of data (modalities) that could be gathered by sensors serving a wide range of applications. Our experiments were conducted on the Amazon Reviews, MovieLens25M, and MovieLens1M datasets. Their outcomes allowed us to confirm that the choice of fusion technique for building a multimodal representation is crucial to obtaining the highest possible model performance from the proper combination of modalities. Consequently, we designed criteria for choosing this optimal data fusion technique.
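The three techniques the abstract compares can be sketched with toy NumPy code. This is only an illustration of the general ideas, not the paper's actual implementation: the embeddings, dimensions, and helper names (`modality_scores`, `count_sketch`) are hypothetical, and the count-sketch projection stands in for the sketch-based representation discussed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
text_emb = rng.normal(size=(4, 8))   # hypothetical text-modality embeddings (4 samples)
image_emb = rng.normal(size=(4, 6))  # hypothetical image-modality embeddings

# Early fusion: concatenate raw modality features into one vector per sample,
# then train a single model on the joint representation.
early = np.concatenate([text_emb, image_emb], axis=1)  # shape (4, 14)

def modality_scores(x, n_classes=3, seed=1):
    """Stand-in per-modality classifier: random linear layer + softmax."""
    w = np.random.default_rng(seed).normal(size=(x.shape[1], n_classes))
    logits = x @ w
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Late fusion: each modality gets its own model; only the class-probability
# outputs are combined (here, by simple averaging).
late = (modality_scores(text_emb, seed=1) + modality_scores(image_emb, seed=2)) / 2

def count_sketch(x, out_dim, seed=3):
    """Compress features into out_dim buckets via random hashes and signs."""
    r = np.random.default_rng(seed)
    h = r.integers(0, out_dim, size=x.shape[1])  # target bucket per feature
    s = r.choice([-1.0, 1.0], size=x.shape[1])   # random sign per feature
    y = np.zeros((x.shape[0], out_dim))
    np.add.at(y.T, h, (x * s).T)                 # unbuffered signed accumulation
    return y

# Sketch: compress the concatenated features into a compact representation.
sketch = count_sketch(early, out_dim=5)
```

The trade-off the paper evaluates is visible even at this scale: early fusion preserves all cross-modal feature interactions but grows the input dimension, late fusion keeps modalities independent until the decision level, and the sketch bounds the representation size regardless of how many modalities are concatenated.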


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f2b1/10007548/e5f14be28ec5/sensors-23-02381-g001.jpg

Similar articles

1. Effective Techniques for Multimodal Data Fusion: A Comparative Analysis.
   Sensors (Basel). 2023 Feb 21;23(5):2381. doi: 10.3390/s23052381.
2. Development of Multimodal Fusion Technology for Tomato Maturity Assessment.
   Sensors (Basel). 2024 Apr 11;24(8):2467. doi: 10.3390/s24082467.
3. Artificial intelligence-based methods for fusion of electronic health records and imaging data.
   Sci Rep. 2022 Oct 26;12(1):17981. doi: 10.1038/s41598-022-22514-4.
4. Multimodal information bottleneck for deep reinforcement learning with multiple sensors.
   Neural Netw. 2024 Aug;176:106347. doi: 10.1016/j.neunet.2024.106347. Epub 2024 Apr 27.
5. Reducing Annotation Burden Through Multimodal Learning.
   Front Big Data. 2020 Jun 2;3:19. doi: 10.3389/fdata.2020.00019. eCollection 2020.
6. Sensor-Fusion for Smartphone Location Tracking Using Hybrid Multimodal Deep Neural Networks.
   Sensors (Basel). 2021 Nov 11;21(22):7488. doi: 10.3390/s21227488.
7. MolPROP: Molecular Property prediction with multimodal language and graph fusion.
   J Cheminform. 2024 May 22;16(1):56. doi: 10.1186/s13321-024-00846-9.
8. A review of deep learning-based information fusion techniques for multimodal medical image classification.
   Comput Biol Med. 2024 Jul;177:108635. doi: 10.1016/j.compbiomed.2024.108635. Epub 2024 May 22.
9. End-to-end multimodal clinical depression recognition using deep neural networks: A comparative analysis.
   Comput Methods Programs Biomed. 2021 Nov;211:106433. doi: 10.1016/j.cmpb.2021.106433. Epub 2021 Sep 28.
10. Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks.
    Comput Intell Neurosci. 2022 Aug 9;2022:4767437. doi: 10.1155/2022/4767437. eCollection 2022.

Cited by

1. Intelligent sensing devices and systems for personalized mental health.
   Med X. 2025 Dec;3(1). doi: 10.1007/s44258-025-00057-3. Epub 2025 Apr 2.
3. Multimodal fusion with relational learning for molecular property prediction.
   Commun Chem. 2025 Jul 5;8(1):200. doi: 10.1038/s42004-025-01586-z.
4. GPS: Harnessing data fusion strategies to improve the accuracy of machine learning-based genomic and phenotypic selection.
   Plant Commun. 2025 Aug 11;6(8):101416. doi: 10.1016/j.xplc.2025.101416. Epub 2025 Jun 11.
5. Multimodal malware classification using proposed ensemble deep neural network framework.
   Sci Rep. 2025 May 23;15(1):18006. doi: 10.1038/s41598-025-96203-3.
6. Recent Advances in Vehicle Driver Health Monitoring Systems.
   Sensors (Basel). 2025 Mar 14;25(6):1812. doi: 10.3390/s25061812.
7. A multi-modal deep learning solution for precise pneumonia diagnosis: the PneumoFusion-Net model.
   Front Physiol. 2025 Mar 12;16:1512835. doi: 10.3389/fphys.2025.1512835. eCollection 2025.

References

1. MultiBench: Multiscale Benchmarks for Multimodal Representation Learning.
   Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1-20.
2. A Review of Multisensor Data Fusion Solutions in Smart Manufacturing: Systems and Trends.
   Sensors (Basel). 2022 Feb 23;22(5):1734. doi: 10.3390/s22051734.
3. Multimodal deep learning for biomedical data fusion: a review.
   Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbab569.
4. A survey on deep multimodal learning for computer vision: advances, trends, applications, and datasets.
   Vis Comput. 2022;38(8):2939-2970. doi: 10.1007/s00371-021-02166-7. Epub 2021 Jun 10.
5. Advances in multimodal data fusion in neuroimaging: Overview, challenges, and novel orientation.
   Inf Fusion. 2020 Dec;64:149-187. doi: 10.1016/j.inffus.2020.07.006. Epub 2020 Jul 17.
6. A Survey on Deep Learning for Multimodal Data Fusion.
   Neural Comput. 2020 May;32(5):829-864. doi: 10.1162/neco_a_01273. Epub 2020 Mar 18.
7. Multimodal Machine Learning: A Survey and Taxonomy.
   IEEE Trans Pattern Anal Mach Intell. 2019 Feb;41(2):423-443. doi: 10.1109/TPAMI.2018.2798607. Epub 2018 Jan 25.
8. Representation learning: a review and new perspectives.
   IEEE Trans Pattern Anal Mach Intell. 2013 Aug;35(8):1798-828. doi: 10.1109/TPAMI.2013.50.
