Dynamical predictive coding with reservoir computing performs noise-robust multi-sensory speech recognition.

Author information

Yonemura Yoshihiro, Katori Yuichi

Affiliations

Graduate of System Information Science, Future University Hakodate, Hakodate, Hokkaido, Japan.

International Research Center for Neurointelligence (IRCN), The University of Tokyo, Tokyo, Japan.

Publication information

Front Comput Neurosci. 2024 Sep 23;18:1464603. doi: 10.3389/fncom.2024.1464603. eCollection 2024.

Abstract

Multi-sensory integration is a perceptual process through which the brain synthesizes a unified perception by integrating inputs from multiple sensory modalities. A key issue is understanding how the brain performs multi-sensory integrations using a common neural basis in the cortex. A cortical model based on reservoir computing has been proposed to elucidate the role of recurrent connectivity among cortical neurons in this process. Reservoir computing is well-suited for time series processing, such as speech recognition. This inquiry focuses on extending a reservoir computing-based cortical model to encompass multi-sensory integration within the cortex. This research introduces a dynamical model of multi-sensory speech recognition, leveraging predictive coding combined with reservoir computing. Predictive coding offers a framework for the hierarchical structure of the cortex. The model integrates reliability weighting, derived from the computational theory of multi-sensory integration, to adapt to multi-sensory time series processing. The model addresses a multi-sensory speech recognition task, necessitating the management of complex time series. We observed that the reservoir effectively recognizes speech by extracting time-contextual information and weighting sensory inputs according to sensory noise. These findings indicate that the dynamic properties of recurrent networks are applicable to multi-sensory time series processing, positioning reservoir computing as a suitable model for multi-sensory integration.
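The reliability weighting mentioned in the abstract can be illustrated with the standard inverse-variance rule from cue-integration theory. This is a minimal sketch under that assumption, not the paper's model: the abstract does not state the exact formulation, and the function names, the two-modality setup, and the numeric values below are hypothetical and chosen only for illustration.

```python
def reliability_weights(noise_variances):
    """Return normalized weights proportional to 1/variance for each modality."""
    if any(v <= 0 for v in noise_variances):
        raise ValueError("noise variances must be positive")
    precisions = [1.0 / v for v in noise_variances]
    total = sum(precisions)
    return [p / total for p in precisions]


def fuse(estimates, noise_variances):
    """Combine per-modality estimates into one reliability-weighted estimate.

    Assumes independent Gaussian noise per modality (an illustrative assumption).
    Returns the fused estimate and its variance.
    """
    weights = reliability_weights(noise_variances)
    fused = sum(w * x for w, x in zip(weights, estimates))
    fused_variance = 1.0 / sum(1.0 / v for v in noise_variances)
    return fused, fused_variance


# Hypothetical example: a noisy audio estimate and a cleaner visual estimate.
audio_estimate, visual_estimate = 0.8, 0.4
audio_var, visual_var = 4.0, 1.0

weights = reliability_weights([audio_var, visual_var])
fused, fused_var = fuse([audio_estimate, visual_estimate], [audio_var, visual_var])
print(weights, fused, fused_var)
```

In this sketch the noisier modality receives the smaller weight, so the fused estimate sits closer to the more reliable input, and the fused variance is lower than either input variance.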

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e30/11456454/95fe0de1e566/fncom-18-1464603-g0001.jpg
