Speech Recognition and Noise Adaptation in Realistic Noises.

Author information

Marrufo-Pérez Miriam I, Lopez-Poveda Enrique A

Affiliations

Instituto de Neurociencias de Castilla y León (INCYL), Universidad de Salamanca, Salamanca, Spain.

Instituto de Investigación Biomédica de Salamanca (IBSAL), Universidad de Salamanca, Salamanca, Spain.

Publication information

Trends Hear. 2025 Jan-Dec;29:23312165251343457. doi: 10.1177/23312165251343457. Epub 2025 May 15.

DOI:10.1177/23312165251343457

PMID:40370075

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12081978/

Abstract

The recognition of isolated words in noise improves as words are delayed from the noise onset. This phenomenon, known as adaptation to noise, has been mostly investigated using synthetic noises. The aim here was to investigate whether adaptation occurs for realistic noises and to what extent it depends on the spectrum and level fluctuations of the noise. Forty-nine different realistic and synthetic noises were analyzed and classified according to how much they fluctuated in level over time and how much their spectra differed from the speech spectrum. Six representative noises were chosen that covered the observed range of level fluctuations and spectral differences but could still mask speech. For the six noises, speech reception thresholds (SRTs) were measured for natural and tone-vocoded words delayed 50 (early condition) and 800 ms (late condition) from the noise onset. Adaptation was calculated as the SRT improvement in the late relative to the early condition. Twenty-two adults with normal hearing participated in the experiments. For natural words, adaptation was small overall (mean = 0.5 dB) and similar across the six noises. For vocoded words, significant adaptation occurred for all six noises (mean = 1.3 dB) and was not statistically different across noises. For the tested noises, the amount of adaptation was independent of the spectrum and level fluctuations of the noise. The results suggest that adaptation in speech recognition can occur in realistic noisy environments.
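
As a minimal illustration of the adaptation measure described in the abstract, the sketch below computes adaptation as the early-condition SRT minus the late-condition SRT, so a positive value means a lower (better) threshold for the delayed words. The function name and the example SRT values are hypothetical and are not data from the study.

```python
from statistics import mean


def adaptation_db(srt_early_db, srt_late_db):
    """Return adaptation in dB: the SRT improvement in the late
    condition relative to the early condition.

    A lower SRT is better, so a positive result means recognition
    improved when the word was delayed from the noise onset.
    """
    return srt_early_db - srt_late_db


# Hypothetical per-listener SRTs in dB SNR (illustrative only,
# not values reported in the study).
srt_early = [-2.0, -1.5, -3.0]  # word delayed 50 ms from noise onset
srt_late = [-3.0, -3.5, -4.0]   # word delayed 800 ms from noise onset

per_listener = [adaptation_db(e, l) for e, l in zip(srt_early, srt_late)]
mean_adaptation = mean(per_listener)
print(per_listener, mean_adaptation)
```

Averaging the per-listener differences, as done here, is one plausible way to obtain a group mean; the abstract does not state how the reported means were aggregated.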

