Contribution of consonant landmarks to speech recognition in simulated acoustic-electric hearing.

Author information

Department of Electrical Engineering, The University of Texas at Dallas, Richardson, Texas, USA.

Publication information

Ear Hear. 2010 Apr;31(2):259-67. doi: 10.1097/AUD.0b013e3181c7db17.

DOI:10.1097/AUD.0b013e3181c7db17

PMID:20081538

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2836394/

Abstract

OBJECTIVES

The purpose of this study is to assess the contribution of information provided by obstruent consonants (e.g., stops and fricatives) to speech intelligibility in simulated acoustic-electric hearing. As a secondary objective, this study examines the performance of an objective measure that can potentially be used for predicting the intelligibility of vocoded speech.

DESIGN

Noise-corrupted sentences are used in experiment 1, in which the noise-corrupted obstruent consonants are replaced with clean obstruent consonants while the sonorant sounds (vowels, semivowels, and nasals) are left corrupted. In one condition, listeners have access only to the low-frequency (<600 Hz) acoustic portion of the clean consonant spectra; in another condition, they have access only to the higher-frequency (>600 Hz), vocoded portion of the clean consonant spectra; and in the third condition, they have access to both. In experiment 2, we investigate a speech-coding strategy that selectively attenuates the low-frequency portion of the consonant spectra while leaving the vocoded portion corrupted by noise. Finally, using the data collected from experiments 1 and 2, we evaluate how well an objective measure predicts the intelligibility of vocoded speech. This measure was originally designed to predict speech quality and has never been evaluated with vocoded speech.
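The segment-replacement idea in experiment 1 can be illustrated with a minimal sketch. It is not the authors' processing chain: the function names, the toy signals, and the obstruent span indices are all hypothetical, and the 600 Hz band split and vocoding are not reproduced. It only shows mixing a masker at a target signal-to-noise ratio and then restoring the clean samples inside labeled obstruent spans while leaving everything else corrupted.

```python
import math
import random

def mix_at_snr(clean, masker, snr_db):
    """Scale the masker so the clean-to-masker power ratio equals snr_db, then add it."""
    p_clean = sum(x * x for x in clean) / len(clean)
    p_mask = sum(x * x for x in masker) / len(masker)
    gain = math.sqrt(p_clean / (p_mask * 10 ** (snr_db / 10)))
    return [c + gain * m for c, m in zip(clean, masker)]

def restore_obstruents(clean, noisy, obstruent_spans):
    """Copy the noisy signal, replacing samples in each [start, end) span with clean ones."""
    out = list(noisy)
    for start, end in obstruent_spans:
        out[start:end] = clean[start:end]
    return out

# Toy stand-ins (assumptions): a sine "sentence" and seeded pseudo-random "masker".
rng = random.Random(0)
clean = [math.sin(2 * math.pi * i / 16) for i in range(64)]
masker = [rng.uniform(-1.0, 1.0) for _ in range(64)]

# -5 dB is one of the SNR levels named in the abstract.
noisy = mix_at_snr(clean, masker, snr_db=-5)

# Hypothetical obstruent boundaries, in samples; real boundaries would come from labels.
obstruent_spans = [(8, 16), (40, 52)]
stimulus = restore_obstruents(clean, noisy, obstruent_spans)
```

In this sketch the whole band is restored inside each span. The three listening conditions described above would additionally require splitting each span at 600 Hz and restoring only the acoustic portion, only the vocoded portion, or both.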

RESULTS

Significant improvements (about 30 percentage points) in intelligibility were noted in experiment 1 in steady and two-talker masker conditions when the listeners had access to the clean obstruent consonants in both the acoustic and the vocoded portions of the spectrum. The improvement was more evident at the low signal-to-noise ratio levels (-5 and 0 dB). Further analysis indicated that it was access to the vocoded portion of the consonant spectra, rather than access to the low-frequency acoustic portion, that contributed the most to the large improvements in performance. In experiment 2, a small (14 percentage points) but statistically significant improvement in performance was obtained at 0 dB signal-to-noise ratio (steady masker) when the obstruent consonants were selectively attenuated in the low-frequency acoustic portion alone (the vocoded portion was left noise corrupted). The examined objective measure predicted the intelligibility of vocoded speech with a relatively high correlation (r = 0.92 to 0.94) [corrected] in both steady and two-talker masking conditions.
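For readers unfamiliar with how such a correlation is computed, the following is a minimal sketch of the Pearson coefficient between an objective measure's per-condition values and listeners' percent-correct scores. The function name is an assumption, and the abstract does not state which objective measure was used or how the scores were paired, so no study data appear here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold one objective-measure value per test condition and `y` the matching mean intelligibility score. A value of r near 1 means the measure ranks and spaces the conditions much as the listeners did.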

CONCLUSIONS

Providing access to the clean obstruent spectra can yield substantial improvements in intelligibility relative to the simulated acoustic-electric condition. Much of this improvement can be attributed to the listeners having access to the clean vocoded portion of the obstruent consonants. The large contribution of obstruent consonants to speech recognition in simulated acoustic-electric hearing stems from the fact that these consonants provide reliable acoustic landmarks, which in turn enable listeners to effectively integrate pieces of the message glimpsed over temporal gaps into one coherent speech stream. It is argued that these landmarks are smeared in existing cochlear implant systems, including bimodal systems, owing to envelope compression and to the fact that the obstruent consonants are probably the first to be masked by background noise. Overall, the outcomes from this study suggest that the obstruent consonants need to be treated differently for improved speech recognition in noise.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63af/2836394/d29697b8ccbf/nihms-166805-f0001.jpg
