Dynamical predictive coding with reservoir computing performs noise-robust multi-sensory speech recognition.

Authors

Yonemura Yoshihiro, Katori Yuichi

Affiliations

Graduate of System Information Science, Future University Hakodate, Hakodate, Hokkaido, Japan.

International Research Center for Neurointelligence (IRCN), The University of Tokyo, Tokyo, Japan.

Publication information

Front Comput Neurosci. 2024 Sep 23;18:1464603. doi: 10.3389/fncom.2024.1464603. eCollection 2024.

DOI:10.3389/fncom.2024.1464603

PMID:39376576

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11456454/

Abstract

Multi-sensory integration is a perceptual process through which the brain synthesizes a unified perception by integrating inputs from multiple sensory modalities. A key issue is understanding how the brain performs multi-sensory integrations using a common neural basis in the cortex. A cortical model based on reservoir computing has been proposed to elucidate the role of recurrent connectivity among cortical neurons in this process. Reservoir computing is well-suited for time series processing, such as speech recognition. This inquiry focuses on extending a reservoir computing-based cortical model to encompass multi-sensory integration within the cortex. This research introduces a dynamical model of multi-sensory speech recognition, leveraging predictive coding combined with reservoir computing. Predictive coding offers a framework for the hierarchical structure of the cortex. The model integrates reliability weighting, derived from the computational theory of multi-sensory integration, to adapt to multi-sensory time series processing. The model addresses a multi-sensory speech recognition task, necessitating the management of complex time series. We observed that the reservoir effectively recognizes speech by extracting time-contextual information and weighting sensory inputs according to sensory noise. These findings indicate that the dynamic properties of recurrent networks are applicable to multi-sensory time series processing, positioning reservoir computing as a suitable model for multi-sensory integration.
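
The two ingredients named in the abstract, reliability weighting and a recurrent reservoir, can be illustrated with a minimal sketch. This is not the authors' model: it assumes standard inverse-variance cue weighting and a generic leaky echo state network, and it omits the predictive-coding hierarchy entirely. All function names, the network size, the leak rate, the spectral radius, and the toy sine-wave "speech" signal are hypothetical choices made for illustration.

```python
import numpy as np


def reliability_weights(noise_variances):
    """Inverse-variance weights that sum to 1 (assumed weighting rule)."""
    precision = 1.0 / np.asarray(noise_variances, dtype=float)
    return precision / precision.sum()


def make_reservoir(n_inputs, n_units, spectral_radius=0.9, seed=0):
    """Random input and recurrent weights; recurrent matrix rescaled to the given spectral radius."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-1.0, 1.0, size=(n_units, n_inputs))
    w_rec = rng.normal(size=(n_units, n_units))
    w_rec *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w_rec)))
    return w_in, w_rec


def run_reservoir(w_in, w_rec, inputs, leak=0.3):
    """Drive a leaky-tanh reservoir with inputs of shape (T, n_inputs); return states of shape (T, n_units)."""
    state = np.zeros(w_rec.shape[0])
    states = []
    for u in inputs:
        state = (1.0 - leak) * state + leak * np.tanh(w_in @ u + w_rec @ state)
        states.append(state.copy())
    return np.array(states)


# Toy stand-in for two sensory streams observing the same underlying signal.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0 * np.pi, 200)
signal = np.sin(t)[:, None]

audio_var, visual_var = 1.0, 0.25  # hypothetical noise variances
audio = signal + rng.normal(scale=np.sqrt(audio_var), size=signal.shape)
visual = signal + rng.normal(scale=np.sqrt(visual_var), size=signal.shape)

# The less noisy stream receives the larger weight.
w = reliability_weights([audio_var, visual_var])
fused = w[0] * audio + w[1] * visual

# The reservoir turns the fused stream into a high-dimensional, time-contextual state.
w_in, w_rec = make_reservoir(n_inputs=1, n_units=50)
states = run_reservoir(w_in, w_rec, fused)
```

In a full reservoir-computing pipeline, a linear readout trained on `states` (for example by ridge regression) would produce the class labels; that step is left out here to keep the sketch short.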

Similar articles

Reservoir Computing Properties of Neural Dynamics in Prefrontal Cortex.

PLoS Comput Biol. 2016 Jun 10;12(6):e1004967. doi: 10.1371/journal.pcbi.1004967. eCollection 2016 Jun.

Performance of a Computational Model of the Mammalian Olfactory System

Evaluation of the computational capabilities of a memristive random network (MN) under the context of reservoir computing.

Neural Netw. 2018 Oct;106:223-236. doi: 10.1016/j.neunet.2018.07.003. Epub 2018 Jul 25.

Recent advances in physical reservoir computing: A review.

Neural Netw. 2019 Jul;115:100-123. doi: 10.1016/j.neunet.2019.03.005. Epub 2019 Mar 20.

Near-Sensor Reservoir Computing for Gait Recognition via a Multi-Gate Electrolyte-Gated Transistor.

Adv Sci (Weinh). 2023 May;10(15):e2300471. doi: 10.1002/advs.202300471. Epub 2023 Mar 22.

Learning shapes cortical dynamics to enhance integration of relevant sensory input.

Neuron. 2023 Jan 4;111(1):106-120.e10. doi: 10.1016/j.neuron.2022.10.001. Epub 2022 Oct 24.

Biological neurons act as generalization filters in reservoir computing.

Proc Natl Acad Sci U S A. 2023 Jun 20;120(25):e2217008120. doi: 10.1073/pnas.2217008120. Epub 2023 Jun 12.

Association between different sensory modalities based on concurrent time series data obtained by a collaborative reservoir computing model.

Sci Rep. 2023 Jan 4;13(1):173. doi: 10.1038/s41598-023-27385-x.

Narrative event segmentation in the cortical reservoir.

PLoS Comput Biol. 2021 Oct 7;17(10):e1008993. doi: 10.1371/journal.pcbi.1008993. eCollection 2021 Oct.

Cited by

Modeling autonomous shifts between focus state and mind-wandering using a predictive-coding-inspired variational recurrent neural network.

Front Comput Neurosci. 2025 Jul 2;19:1578135. doi: 10.3389/fncom.2025.1578135. eCollection 2025.
