Speech Categorization Reveals the Role of Early-Stage Temporal-Coherence Processing in Auditory Scene Analysis.

Affiliations

Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907

Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213.

Publication information

J Neurosci. 2022 Jan 12;42(2):240-254. doi: 10.1523/JNEUROSCI.1610-21.2021. Epub 2021 Nov 11.

DOI:10.1523/JNEUROSCI.1610-21.2021
PMID:34764159
Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC8802934/

Abstract

Temporal coherence of sound fluctuations across spectral channels is thought to aid auditory grouping and scene segregation. Although prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, neurophysiological evidence suggests that temporal-coherence-based scene analysis may start as early as the cochlear nucleus (i.e., the first auditory region supporting cross-channel processing over a wide frequency range). Accordingly, we hypothesized that aspects of temporal-coherence processing that could be realized in early auditory areas may shape speech understanding in noise. We then explored whether physiologically plausible computational models could account for results from a behavioral experiment that measured consonant categorization in different masking conditions. We tested whether within-channel masking of target-speech modulations predicted consonant confusions across the different conditions and whether predictions were improved by adding across-channel temporal-coherence processing mirroring the computations known to exist in the cochlear nucleus. Consonant confusions provide a rich characterization of error patterns in speech categorization, and are thus crucial for rigorously testing models of speech perception; however, to the best of our knowledge, they have not been used in prior studies of scene analysis. We find that within-channel modulation masking can reasonably account for category confusions, but that it fails when temporal fine structure cues are unavailable. However, the addition of across-channel temporal-coherence processing significantly improves confusion predictions across all tested conditions. Our results suggest that temporal-coherence processing strongly shapes speech understanding in noise and that physiological computations that exist early along the auditory pathway may contribute to this process. 
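The notion of temporal coherence across spectral channels can be illustrated with a minimal sketch. This is not the authors' model: it synthesizes three amplitude-modulated tones directly instead of passing sound through a cochlear filterbank, and it uses a plain Pearson correlation between envelopes as a stand-in coherence measure. The carrier frequencies, modulation rates, and function names are all assumptions chosen for illustration.

```python
import numpy as np

def envelope(x):
    """Amplitude envelope via an FFT-based analytic signal (Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    if n % 2 == 0:
        weights[0] = weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[0] = 1.0
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def envelope_coherence(x, y):
    """Pearson correlation between the envelopes of two channels (illustrative metric)."""
    return np.corrcoef(envelope(x), envelope(y))[0, 1]

fs = 16000                      # sampling rate in Hz (assumed)
t = np.arange(fs) / fs          # one second of samples

# Two hypothetical slow modulators: channels 1 and 2 share one, channel 3 has another.
mod_shared = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
mod_other = 1.0 + 0.8 * np.sin(2 * np.pi * 7 * t)

chan1 = mod_shared * np.sin(2 * np.pi * 500 * t)
chan2 = mod_shared * np.sin(2 * np.pi * 2000 * t)
chan3 = mod_other * np.sin(2 * np.pi * 4000 * t)

coh_same = envelope_coherence(chan1, chan2)   # comodulated channels
coh_diff = envelope_coherence(chan1, chan3)   # independently modulated channels
print(f"coherent pair: {coh_same:.3f}, incoherent pair: {coh_diff:.3f}")
```

In this toy setup, channels that fluctuate together yield an envelope correlation near 1, while channels with unrelated fluctuations yield a value near 0. A grouping mechanism based on temporal coherence would treat the first pair as likely belonging to one source.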

Significance Statement

Temporal coherence of sound fluctuations across distinct frequency channels is thought to be important for auditory scene analysis. Prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, and it was unknown whether speech understanding in noise may be shaped by across-channel processing that exists in earlier auditory areas. Using physiologically plausible computational modeling to predict consonant confusions across different listening conditions, we find that across-channel temporal coherence contributes significantly to scene analysis and speech perception and that such processing may arise in the auditory pathway as early as the brainstem. By virtue of providing a richer characterization of error patterns not obtainable with just intelligibility scores, consonant confusions yield unique insight into scene analysis mechanisms.
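The point that confusion patterns carry information beyond an intelligibility score can be made concrete with a small sketch. The two confusion matrices below are invented toy counts, not data from the study, and the comparison metric (Pearson correlation of off-diagonal confusion rates) is an assumed illustrative choice that may differ from the one the authors used.

```python
import numpy as np

consonants = ["b", "d", "p", "t"]   # hypothetical stimulus set

# Rows = consonant presented, columns = consonant reported.
# Toy counts (100 trials per row), made up purely for illustration.
voicing_errors = np.array([
    [70,  5, 20,  5],
    [ 5, 70,  5, 20],
    [20,  5, 70,  5],
    [ 5, 20,  5, 70],
])
place_errors = np.array([
    [70, 20,  5,  5],
    [20, 70,  5,  5],
    [ 5,  5, 70, 20],
    [ 5,  5, 20, 70],
])

def percent_correct(confusions):
    """Overall intelligibility score: share of responses on the diagonal."""
    return 100.0 * np.trace(confusions) / confusions.sum()

def error_pattern_similarity(a, b):
    """Correlate off-diagonal confusion rates of two matrices (illustrative metric)."""
    off_diagonal = ~np.eye(len(a), dtype=bool)
    rates_a = a / a.sum(axis=1, keepdims=True)
    rates_b = b / b.sum(axis=1, keepdims=True)
    return np.corrcoef(rates_a[off_diagonal], rates_b[off_diagonal])[0, 1]

score_voicing = percent_correct(voicing_errors)
score_place = percent_correct(place_errors)
similarity = error_pattern_similarity(voicing_errors, place_errors)
print(score_voicing, score_place, round(similarity, 3))
```

Both toy matrices give the same percent-correct score, yet their error patterns are negatively correlated, because one concentrates errors on voicing confusions and the other on place confusions. A model evaluated only on intelligibility could not distinguish these two cases.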
