Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues.

Authors

Wirtzfeld Michael R, Ibrahim Rasha A, Bruce Ian C

Affiliation

Department of Electrical and Computer Engineering, McMaster University, 1280 Main Street West, Hamilton, L8S 4K1, ON, Canada.

Publication information

J Assoc Res Otolaryngol. 2017 Oct;18(5):687-710. doi: 10.1007/s10162-017-0627-7. Epub 2017 Jul 26.

Abstract

Perceptual studies of speech intelligibility have shown that slow variations of the acoustic envelope (ENV) in a small set of frequency bands provide adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain recovered ENV cues from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other retaining TFS cues; (b) examined the level of recovered ENV from cochlear filtering of TFS speech; (c) examined and quantified the contribution to recovered ENV from spike-timing cues using a lateral inhibition network (LIN); and (d) constructed linear regression models with objective measures of mean-rate and spike-timing neural cues and subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original ENV and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the recovered ENV from the LIN-processed TFS speech. The best model predictions of chimaeric speech intelligibility were found when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
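Step (d) of the abstract can be illustrated with a minimal sketch of a two-predictor linear regression. The abstract does not give the regression form, the objective measures, or any data, so everything below is an assumption made for illustration: the predictor names `mean_rate` and `spike_timing`, the helper functions, and the synthetic numbers (generated from an arbitrary linear rule, not taken from the study). The sketch only shows how a model with both predictors can be compared against a mean-rate-only model through the coefficient of determination.

```python
# Illustrative only: synthetic placeholder values, NOT data from the study.
# Ordinary least squares in pure Python via the normal equations.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_ols(predictors, scores):
    """Return [intercept, b1, b2, ...] minimizing squared error."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def r_squared(predictors, scores, coeffs):
    """Fraction of score variance explained by the fitted model."""
    mean_y = sum(scores) / len(scores)
    fitted = [coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], p))
              for p in predictors]
    ss_res = sum((y - f) ** 2 for y, f in zip(scores, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in scores)
    return 1.0 - ss_res / ss_tot

# Hypothetical normalized neural measures for six stimulus conditions.
mean_rate = [0.2, 0.4, 0.5, 0.7, 0.9, 0.3]
spike_timing = [0.6, 0.1, 0.8, 0.3, 0.5, 0.9]
# Synthetic "perception scores" built from an arbitrary linear rule.
scores = [10 + 40 * m + 25 * s for m, s in zip(mean_rate, spike_timing)]

both = list(zip(mean_rate, spike_timing))
rate_only = [(m,) for m in mean_rate]

both_coeffs = fit_ols(both, scores)
rate_coeffs = fit_ols(rate_only, scores)
r2_both = r_squared(both, scores, both_coeffs)
r2_rate_only = r_squared(rate_only, scores, rate_coeffs)
print(both_coeffs, r2_both, r2_rate_only)
```

Because the synthetic scores depend on both predictors by construction, the combined model fits better than the mean-rate-only model here. That outcome is a property of the toy data and should not be read as a reproduction of the study's results.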
