Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA.
School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China.
Nat Neurosci. 2023 Dec;26(12):2213-2225. doi: 10.1038/s41593-023-01468-4. Epub 2023 Oct 30.
The human auditory system extracts rich linguistic abstractions from speech signals. Traditional approaches to understanding this complex process have used linear feature-encoding models, with limited success. Artificial neural networks excel in speech recognition tasks and offer promising computational models of speech processing. We used speech representations in state-of-the-art deep neural network (DNN) models to investigate neural coding from the auditory nerve to the speech cortex. Representations in hierarchical layers of the DNN correlated well with the neural activity throughout the ascending auditory system. Unsupervised speech models performed at least as well as other purely supervised or fine-tuned models. Deeper DNN layers were better correlated with the neural activity in the higher-order auditory cortex, with computations aligned with phonemic and syllabic structures in speech. Accordingly, DNN models trained on either English or Mandarin predicted cortical responses in native speakers of each language. These results reveal convergence between DNN model representations and the biological auditory pathway, offering new approaches for modeling neural coding in the auditory cortex.
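The layer-wise comparison described above can be illustrated with a linear encoding analysis: fit a regularized linear map from each DNN layer's features to a recorded response, then score each layer by how well it predicts held-out data. The following is a minimal sketch on synthetic data, not the authors' pipeline. The ridge penalty `alpha`, the 80/20 split, the feature dimensions, and the names `fit_ridge`, `pearson_r`, and `layer_scores` are all assumptions made for illustration, and the "layers" here are random matrices standing in for real model activations.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def layer_scores(layer_feats, response, alpha=1.0, train_frac=0.8):
    """Held-out prediction correlation for each layer's features.

    layer_feats: list of (n_samples, n_dims) arrays, one per layer.
    response:    (n_samples,) array, e.g. one electrode's activity.
    """
    n_train = int(train_frac * len(response))
    y_mean = response[:n_train].mean()
    scores = []
    for X in layer_feats:
        # Standardize with training-set statistics only, to avoid leakage.
        mu = X[:n_train].mean(axis=0)
        sd = X[:n_train].std(axis=0) + 1e-8
        Xz = (X - mu) / sd
        w = fit_ridge(Xz[:n_train], response[:n_train] - y_mean, alpha)
        pred = Xz[n_train:] @ w + y_mean
        scores.append(pearson_r(pred, response[n_train:]))
    return scores


# Synthetic stand-ins (assumed): four "layers" of random features, and a
# response generated from layer 2 plus noise.
rng = np.random.default_rng(0)
n_samples, n_dims, n_layers = 2000, 16, 4
layer_feats = [rng.standard_normal((n_samples, n_dims)) for _ in range(n_layers)]
true_layer = 2
weights = rng.standard_normal(n_dims)
response = layer_feats[true_layer] @ weights + 0.5 * rng.standard_normal(n_samples)

scores = layer_scores(layer_feats, response)
best_layer = int(np.argmax(scores))
print("best layer:", best_layer)
print("held-out r per layer:", [round(s, 2) for s in scores])
```

In this toy setup the best-scoring layer is the one that generated the response. With real recordings, the same per-layer score computed across recording sites is what lets one ask which model depth best matches each stage of the auditory pathway.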