Recurrent neural networks as neuro-computational models of human speech recognition.

Authors

Brodbeck Christian, Hannagan Thomas, Magnuson James S

Affiliations

Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada.

Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, United States of America.

Publication information

PLoS Comput Biol. 2025 Jul 28;21(7):e1013244. doi: 10.1371/journal.pcbi.1013244. eCollection 2025 Jul.

Abstract

Human speech recognition transforms a continuous acoustic signal into categorical linguistic units, by aggregating information that is distributed in time. It has been suggested that this kind of information processing may be understood through the computations of a Recurrent Neural Network (RNN) that receives input frame by frame, linearly in time, but builds an incremental representation of this input through a continually evolving internal state. While RNNs can simulate several key behavioral observations about human speech and language processing, it is unknown whether RNNs also develop computational dynamics that resemble human neural speech processing. Here we show that the internal dynamics of long short-term memory (LSTM) RNNs, trained to recognize speech from auditory spectrograms, predict human neural population responses to the same stimuli, beyond predictions from auditory features. Variations in the RNN architecture motivated by cognitive principles further improved this predictive power. Specifically, modifications that allow more human-like phonetic competition also led to more human-like temporal dynamics. Overall, our results suggest that RNNs provide plausible computational models of the cortical processes supporting human speech recognition.
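The frame-by-frame processing described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the class name `TinyLSTM`, the layer sizes, and the random untrained weights are all assumptions made for illustration, and the input is random numbers standing in for an auditory spectrogram. The sketch only shows how an LSTM carries an evolving internal state across frames, so that the state at each time step summarizes the input received so far.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyLSTM:
    """Single LSTM layer with random, untrained weights (hypothetical, illustration only)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to [current frame, previous hidden state]
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
                   for _ in range(n_hidden)]
            for gate in self.GATES
        }

    def step(self, frame, h, c):
        z = list(frame) + list(h)
        pre = {gate: matvec(self.weights[gate], z) for gate in self.GATES}
        i = [sigmoid(x) for x in pre["input"]]
        f = [sigmoid(x) for x in pre["forget"]]
        o = [sigmoid(x) for x in pre["output"]]
        g = [math.tanh(x) for x in pre["candidate"]]
        # Cell state: keep part of the old memory, add part of the new candidate
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        # Hidden state: gated read-out of the cell state
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def run(self, spectrogram):
        """Process frames strictly in temporal order; return the hidden state after each frame."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        states = []
        for frame in spectrogram:
            h, c = self.step(frame, h, c)
            states.append(h)
        return states


# Toy input: 20 time frames x 8 frequency bands (random stand-in for a spectrogram)
rng = random.Random(1)
spectrogram = [[rng.random() for _ in range(8)] for _ in range(20)]
model = TinyLSTM(n_in=8, n_hidden=4)
states = model.run(spectrogram)
print(len(states), len(states[0]))
```

Each entry of `states` is a time series of hidden-unit activity aligned to the input frames. Time series of this kind, taken from a trained network, are the sort of internal dynamics the abstract compares with neural responses; the comparison itself is not sketched here.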

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcdf/12331064/84825d0d5c97/pcbi.1013244.g001.jpg
