The contribution of obstruent consonants and acoustic landmarks to speech recognition in noise.

Authors

Li Ning, Loizou Philipos C

Affiliation

Department of Electrical Engineering, University of Texas at Dallas, Richardson, Texas 75083-0688, USA.

Publication information

J Acoust Soc Am. 2008 Dec;124(6):3947. doi: 10.1121/1.2997435.

DOI: 10.1121/1.2997435

PMID: 19206819

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2676629/

Abstract

The obstruent consonants (e.g., stops) are more susceptible to noise than vowels, raising the question of whether the degradation of speech intelligibility in noise can be attributed, at least partially, to the loss of information carried by obstruent consonants. Experiment 1 assessed the contribution of obstruent consonants to speech recognition in noise by presenting sentences containing clean obstruent consonants but noise-corrupted voiced sounds (e.g., vowels). Results indicated a substantial (threefold) improvement in speech recognition, particularly at low signal-to-noise ratio levels (-5 dB). Experiment 2 assessed the importance of providing partial information, within a frequency region, of the obstruent-consonant spectra while leaving the remaining spectral region unaltered (i.e., noise corrupted). Access to the low-frequency (0-1000 Hz) region of the clean obstruent-consonant spectra was found to be sufficient to realize significant improvements in performance, which was attributed to improved transmission of voicing information. The outcomes of the two experiments suggest that much of the improvement in performance must be due to enhanced access to acoustic landmarks, evident in spectral discontinuities signaling the onsets of obstruent consonants. These landmarks, often blurred in noisy conditions, are critically important for understanding speech in noise because they allow better determination of syllable structure and word boundaries.
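
The two stimulus conditions described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' processing chain. It assumes a whole-utterance power-ratio definition of SNR, hypothetical hand-labeled obstruent segment boundaries, a synthetic tone standing in for a speech waveform, and a simple per-segment FFT band swap for the low-frequency condition. The names `scale_noise_to_snr` and `splice_clean_obstruents` are invented for illustration.

```python
import numpy as np


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db` (dB).

    Assumption: SNR is defined over the whole utterance.
    """
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return gain * noise


def splice_clean_obstruents(clean, noisy, segments, fs, cutoff_hz=None):
    """Return a copy of `noisy` with clean information restored in `segments`.

    segments  : list of (start, end) sample indices (hypothetical labels).
    cutoff_hz : None restores the full band (Experiment 1 style);
                a number restores only 0..cutoff_hz (Experiment 2 style).
    """
    out = noisy.copy()
    for start, end in segments:
        if cutoff_hz is None:
            out[start:end] = clean[start:end]
            continue
        n = end - start
        clean_spec = np.fft.rfft(clean[start:end])
        mixed_spec = np.fft.rfft(noisy[start:end])
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        low = freqs <= cutoff_hz
        # Keep the noisy spectrum above the cutoff, use the clean one below it.
        mixed_spec[low] = clean_spec[low]
        out[start:end] = np.fft.irfft(mixed_spec, n=n)
    return out


fs = 16000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 220 * t)          # stand-in for a clean sentence
noise = scale_noise_to_snr(clean, rng.standard_normal(fs), snr_db=-5.0)
noisy = clean + noise
segments = [(4000, 5600)]                    # hypothetical obstruent interval

exp1 = splice_clean_obstruents(clean, noisy, segments, fs)
exp2 = splice_clean_obstruents(clean, noisy, segments, fs, cutoff_hz=1000.0)
```

With real recordings, `segments` would come from a phonetic transcription, and a filter-based band split may be preferable to the abrupt per-segment FFT swap used here.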

Similar articles

The contribution of obstruent consonants and acoustic landmarks to speech recognition in noise.

J Acoust Soc Am. 2008 Dec;124(6):3947. doi: 10.1121/1.2997435.

Contribution of consonant landmarks to speech recognition in simulated acoustic-electric hearing.

Ear Hear. 2010 Apr;31(2):259-67. doi: 10.1097/AUD.0b013e3181c7db17.

Masking release and the contribution of obstruent consonants on speech recognition in noise by cochlear implant users.

J Acoust Soc Am. 2010 Sep;128(3):1262-71. doi: 10.1121/1.3466845.

The effects of selective consonant amplification on sentence recognition in noise by hearing-impaired listeners.

J Acoust Soc Am. 2011 Nov;130(5):3028-37. doi: 10.1121/1.3641407.

Factors affecting masking release in cochlear-implant vocoded speech.

J Acoust Soc Am. 2009 Jul;126(1):338-46. doi: 10.1121/1.3133702.

Sentence intelligibility during segmental interruption and masking by speech-modulated noise: Effects of age and hearing loss.

J Acoust Soc Am. 2015 Jun;137(6):3487-501. doi: 10.1121/1.4921603.

The relative importance of consonant and vowel segments to the recognition of words and sentences: effects of age and hearing loss.

J Acoust Soc Am. 2012 Sep;132(3):1667-78. doi: 10.1121/1.4739463.

Children's recognition of American English consonants in noise.

J Acoust Soc Am. 2010 May;127(5):3177-88. doi: 10.1121/1.3377080.

Assessing the perceptual contributions of vowels and consonants to Mandarin sentence intelligibility.

J Acoust Soc Am. 2013 Aug;134(2):EL178-84. doi: 10.1121/1.4812820.

Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise.

J Acoust Soc Am. 2012 May;131(5):4104-13. doi: 10.1121/1.3695401.

Cited by

Temporal fine structure influences voicing confusions for consonant identification in multi-talker babble.

J Acoust Soc Am. 2021 Oct;150(4):2664. doi: 10.1121/10.0006527.

The effect of increased channel interaction on speech perception with cochlear implants.

Sci Rep. 2021 May 17;11(1):10383. doi: 10.1038/s41598-021-89932-8.

Attention selectively modulates cortical entrainment in different regions of the speech spectrum.

Brain Res. 2016 Aug 1;1644:203-12. doi: 10.1016/j.brainres.2016.05.029. Epub 2016 May 16.

Evaluation of a spectral subtraction strategy to suppress reverberant energy in cochlear implant devices.

J Acoust Soc Am. 2015 Jul;138(1):115-24. doi: 10.1121/1.4922331.

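Sentence intelligibility during segmental interruption and masking by speech-modulated noise: Effects of age and hearing loss.

J Acoust Soc Am. 2015 Jun;137(6):3487-501. doi: 10.1121/1.4921603.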

Factors constraining the benefit to speech understanding of combining information from low-frequency hearing and a cochlear implant.

Hear Res. 2015 Apr;322:107-11. doi: 10.1016/j.heares.2014.09.010. Epub 2014 Oct 5.

Psychoacoustic and phoneme identification measures in cochlear-implant and normal-hearing listeners.

Trends Amplif. 2013 Mar;17(1):27-44. doi: 10.1177/1084713813477244. Epub 2013 Feb 21.

Effects of age and hearing loss on the relationship between discrimination of stochastic frequency modulation and speech perception.

Ear Hear. 2012 Nov-Dec;33(6):709-20. doi: 10.1097/AUD.0b013e31825aab15.

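Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise.

J Acoust Soc Am. 2012 May;131(5):4104-13. doi: 10.1121/1.3695401.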

Tackling the combined effects of reverberation and masking noise using ideal channel selection.

J Speech Lang Hear Res. 2012 Apr;55(2):500-10. doi: 10.1044/1092-4388(2011/11-0073). Epub 2012 Jan 9.

References

A probabilistic framework for landmark detection based on phonetic features for automatic speech recognition.

J Acoust Soc Am. 2008 Feb;123(2):1154-68. doi: 10.1121/1.2823754.

Contribution of consonant versus vowel information to sentence intelligibility for young normal-hearing and elderly hearing-impaired listeners.

J Acoust Soc Am. 2007 Oct;122(4):2365-75. doi: 10.1121/1.2773986.

Consonant and vowel confusions in speech-weighted noise.

J Acoust Soc Am. 2007 Apr;121(4):2312-26. doi: 10.1121/1.2642397.

The relative roles of vowels and consonants in discriminating talker identity versus word meaning.

J Acoust Soc Am. 2006 Mar;119(3):1727-39. doi: 10.1121/1.2161431.

The influence of noise on vowel and consonant cues.

J Acoust Soc Am. 2005 Dec;118(6):3874-88. doi: 10.1121/1.2118407.

Analysis of speech-based Speech Transmission Index methods with implications for nonlinear operations.

J Acoust Soc Am. 2004 Dec;116(6):3679-89. doi: 10.1121/1.1804628.

The role of selected stimulus-variables in the perception of the unvoiced stop consonants.

Am J Psychol. 1952 Oct;65(4):497-516.

Toward a model for lexical access based on acoustic landmarks and distinctive features.

J Acoust Soc Am. 2002 Apr;111(4):1872-91. doi: 10.1121/1.1458026.

Effect of stimulus bandwidth on the perception of /s/ in normal- and hearing-impaired children and adults.

J Acoust Soc Am. 2001 Oct;110(4):2183-90. doi: 10.1121/1.1400757.

Consonant recordings for speech testing.

J Acoust Soc Am. 1999 Dec;106(6):L71-4. doi: 10.1121/1.428150.
