Interference of mid-level speech and noise statistics underlies human speech recognition sensitivity in natural environmental noise.

Author information

Clonan Alex C, Zhai Xiu, Stevenson Ian H, Escabí Monty A

Affiliations

Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269.

Biomedical Engineering, University of Connecticut, Storrs, CT 06269.

Publication information

J Neurosci. 2025 Jul 8. doi: 10.1523/JNEUROSCI.1751-24.2025.

DOI: 10.1523/JNEUROSCI.1751-24.2025
PMID: 40628526
Abstract

Recognizing speech in noise, such as in a busy restaurant, is an essential cognitive skill where the task difficulty varies across environments and noise levels. Although there is growing evidence that the auditory system relies on statistical representations for perceiving and coding natural sounds, it is less clear how statistical cues and neural representations contribute to segregating speech in natural auditory scenes. Here we demonstrate that male and female human listeners rely on mid-level statistics to segregate and recognize speech in environmental noise. Using natural backgrounds and variants with perturbed spectrotemporal statistics, we show that speech recognition accuracy at a fixed noise level varies extensively across natural backgrounds (0% to 100%). Furthermore, for each background the unique interference created by summary statistics can mask or unmask speech, thus hindering or improving speech recognition. To identify the neural coding strategy and statistical cues that influence accuracy, we developed a framework that links summary statistics from a neural model to word recognition accuracy. Whereas summary statistics from a peripheral cochlear model account for only 60% of perceptual variance, summary statistics from a mid-level auditory midbrain model accurately predict single-trial sensory judgments, accounting for more than 90% of the perceptual variance. Furthermore, perceptual weights from the regression framework identify which statistics and tuned neural filters are influential and how they impact recognition. Thus, perception of speech in natural backgrounds relies on a mid-level auditory representation involving interference of multiple summary statistics that impact recognition beneficially or detrimentally across natural background sounds.

Significance Statement

Recognizing speech in natural auditory scenes with competing talkers and environmental noise is a critical cognitive skill. Although normal listeners effortlessly perform this task, for instance in a crowded restaurant, it challenges individuals with hearing loss and our most sophisticated machine systems. We tested human participants listening to speech in natural noises with varied statistical characteristics and demonstrate that they rely on a statistical representation of sounds to segregate speech from environmental noise. Using a model of the auditory system, we then demonstrate that a brain-inspired statistical representation of natural sounds accurately predicts human perceptual trends across a wide range of natural backgrounds and noise levels and reveals key statistical features and neural computations underlying human abilities for this task.
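
The abstract describes a regression framework that links model-derived summary statistics to single-trial word recognition and yields perceptual weights. The abstract does not give the model form, so the following is only a minimal sketch that assumes a logistic regression. The data are synthetic, and every name in it (`fit_logistic`, `predict_proba`, `true_w`, the number of statistics) is hypothetical rather than taken from the paper.

```python
import numpy as np

def predict_proba(X, w, b):
    """Probability of a correct response under a logistic model."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def fit_logistic(X, y, lr=0.5, n_iter=1000, l2=1e-2):
    """Fit L2-regularized logistic regression by batch gradient descent.

    X: (n_trials, n_stats) summary statistics per trial (hypothetical features).
    y: (n_trials,) 1.0 if the word was recognized on that trial, else 0.0.
    Returns the weight vector w (the "perceptual weights") and the bias b.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = predict_proba(X, w, b)
        w -= lr * (X.T @ (p - y) / n + l2 * w)  # gradient of mean log-loss plus L2 term
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic stand-in for real data: 5 made-up summary statistics per trial.
rng = np.random.default_rng(0)
n_trials, n_stats = 2000, 5
X = rng.standard_normal((n_trials, n_stats))

# Assumed ground-truth weights: positive values aid recognition,
# negative values hinder it, zeros have no influence.
true_w = np.array([1.5, -2.0, 0.0, 0.5, 0.0])
y = (rng.random(n_trials) < predict_proba(X, true_w, 0.0)).astype(float)

w, b = fit_logistic(X, y)
acc = np.mean((predict_proba(X, w, b) > 0.5) == y)
print("estimated weights:", np.round(w, 2))
print("single-trial prediction accuracy:", round(float(acc), 3))
```

In this toy setting, the sign of each fitted weight indicates whether the corresponding statistic helps or hurts recognition, which mirrors the beneficial or detrimental interference described in the abstract. The actual study's features, regularization, and evaluation may differ.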

