
Brain-inspired multisensory integration neural network for cross-modal recognition through spatiotemporal dynamics and deep learning.

Author Information

Yu Haitao, Zhao Quanfa

Affiliation

School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.

Publication Information

Cogn Neurodyn. 2024 Dec;18(6):3615-3628. doi: 10.1007/s11571-023-09932-4. Epub 2023 Feb 2.

Abstract

The integration and interaction of cross-modal senses in brain neural networks can facilitate high-level cognitive functions. In this work, we propose a bio-inspired multisensory integration neural network (MINN) that integrates visual and auditory senses for recognizing multimodal information across sensory modalities. This deep-learning-based model incorporates a cascading framework of parallel convolutional neural networks (CNNs) for extracting intrinsic features from visual and audio inputs, and a recurrent neural network (RNN) for multimodal information integration and interaction. The network was trained on synthetic data generated for digit recognition tasks. We found that the spatial and temporal features extracted from the visual and audio inputs by the CNNs were encoded in mutually orthogonal subspaces. During the integration epoch, the network state evolved along quasi-rotation-symmetric trajectories, and a structured manifold with stable attractors formed in the RNN, supporting accurate cross-modal recognition. We further evaluated the robustness of the MINN algorithm to noisy inputs and asynchronous digit inputs. Experimental results demonstrate the superior performance of MINN in flexibly integrating and accurately recognizing multisensory information with distinct sensory properties. These results provide insight into the computational principles governing multisensory integration and a comprehensive neural network model for brain-inspired intelligence.
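The cascading architecture described above, parallel per-modality feature extractors feeding a recurrent integrator, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the paper's CNNs are replaced here by random linear projections with a tanh nonlinearity, and all dimensions, weights, and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not taken from the paper)
VIS_DIM, AUD_DIM = 64, 32   # per-timestep visual / audio input sizes
FEAT_DIM = 16               # feature size per modality
HID_DIM = 32                # RNN hidden-state size
N_CLASSES = 10              # digits 0-9
T = 5                       # timesteps in the integration epoch

# Stand-ins for the parallel CNN feature extractors (random projections here)
W_vis = rng.standard_normal((FEAT_DIM, VIS_DIM)) / np.sqrt(VIS_DIM)
W_aud = rng.standard_normal((FEAT_DIM, AUD_DIM)) / np.sqrt(AUD_DIM)

# Elman-style RNN for multimodal integration and readout
W_in = rng.standard_normal((HID_DIM, 2 * FEAT_DIM)) / np.sqrt(2 * FEAT_DIM)
W_rec = rng.standard_normal((HID_DIM, HID_DIM)) / np.sqrt(HID_DIM)
W_out = rng.standard_normal((N_CLASSES, HID_DIM)) / np.sqrt(HID_DIM)

def minn_forward(vis_seq, aud_seq):
    """Run one multimodal sequence through the sketch network.

    vis_seq: (T, VIS_DIM); aud_seq: (T, AUD_DIM).
    Returns class scores (N_CLASSES,) and the hidden trajectory (T, HID_DIM).
    """
    h = np.zeros(HID_DIM)
    trajectory = []
    for t in range(T):
        f_vis = np.tanh(W_vis @ vis_seq[t])    # per-timestep visual features
        f_aud = np.tanh(W_aud @ aud_seq[t])    # per-timestep audio features
        x = np.concatenate([f_vis, f_aud])     # cross-modal feature fusion
        h = np.tanh(W_in @ x + W_rec @ h)      # recurrent integration step
        trajectory.append(h.copy())
    return W_out @ h, np.stack(trajectory)

scores, traj = minn_forward(rng.standard_normal((T, VIS_DIM)),
                            rng.standard_normal((T, AUD_DIM)))
print(scores.shape, traj.shape)  # (10,) (5, 32)
```

The returned `trajectory` is what one would inspect to study the state dynamics the abstract describes (quasi-rotation-symmetric evolution and attractor structure); in the sketch the weights are untrained, so only the data flow, not the learned manifold, is reproduced.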


