Arehart Kathryn, Souza Pamela, Kates James, Lunner Thomas, Pedersen Michael Syskind
1Speech Language and Hearing Sciences, University of Colorado Boulder, Boulder, CO, USA; 2Communication Sciences and Disorders and Knowles Hearing Center, Northwestern University, Evanston, IL, USA; 3Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; 4Linnaeus Centre HEAD, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden; and 5Oticon A/S, Smørum, Denmark.
Ear Hear. 2015 Sep-Oct;36(5):505-16. doi: 10.1097/AUD.0000000000000173.
This study considered speech modified by additive babble combined with noise-suppression processing. The purpose was to determine the relative importance of the signal modifications, individual peripheral hearing loss, and individual cognitive capacity on speech intelligibility and speech quality.
The participant group consisted of 31 individuals with moderate high-frequency hearing loss, ranging in age from 51 to 89 years (mean = 69.6 years). Speech intelligibility and speech quality were measured using low-context sentences presented in babble at several signal-to-noise ratios. Speech stimuli were processed with a binary mask noise-suppression strategy with systematic manipulations of two parameters (error rate and attenuation values). The cumulative effects of signal modification produced by babble and signal processing were quantified using an envelope-distortion metric. Working memory capacity was assessed with a reading span test. Analysis of variance was used to determine the effects of signal processing parameters on perceptual scores. Hierarchical linear modeling was used to determine the role of degree of hearing loss and working memory capacity in individual listeners' responses to the processed noisy speech. The model also considered improvements in envelope fidelity caused by the binary mask and the degradations to the envelope caused by mask errors and noise.
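The two manipulated parameters can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the speech and babble are available separately as time-frequency power arrays, that the mask keeps units whose local signal-to-noise ratio exceeds a criterion (0 dB here, an assumed value), that the error rate is the fraction of mask units flipped at random, and that units labeled as noise are attenuated by a fixed number of decibels. The function names `ideal_binary_mask`, `flip_mask_units`, and `mask_gains` are hypothetical.

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, local_criterion_db=0.0):
    """Return True where the local SNR exceeds the criterion (assumed 0 dB)."""
    eps = 1e-12  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return local_snr_db > local_criterion_db

def flip_mask_units(mask, error_rate, rng):
    """Flip a random fraction (error_rate) of the mask units."""
    flips = rng.random(mask.shape) < error_rate
    return np.logical_xor(mask, flips)

def mask_gains(mask, attenuation_db):
    """Amplitude gain per unit: 1 where kept, attenuated elsewhere."""
    floor = 10.0 ** (-attenuation_db / 20.0)
    return np.where(mask, 1.0, floor)

# Toy 2x2 example (made-up numbers, not data from the study).
speech = np.array([[4.0, 1.0], [1.0, 4.0]])
noise = np.array([[1.0, 4.0], [4.0, 1.0]])
rng = np.random.default_rng(0)

mask = ideal_binary_mask(speech, noise)
noisy_mask = flip_mask_units(mask, error_rate=0.3, rng=rng)
gains = mask_gains(noisy_mask, attenuation_db=20.0)
```

In this sketch the gains would be multiplied with the magnitude of the noisy mixture in each time-frequency unit before resynthesis. A larger `attenuation_db` suppresses rejected units more strongly, and a larger `error_rate` moves the mask further from the ideal one.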
The participants showed significant benefits in terms of intelligibility scores and quality ratings for noisy speech processed by the ideal binary mask noise-suppression strategy. This benefit was observed across a range of signal-to-noise ratios and persisted when up to a 30% error rate was introduced into the processing. Average intelligibility scores and average quality ratings were well predicted by an objective metric of envelope fidelity. Degree of hearing loss and working memory capacity were significant factors in explaining individual listeners' intelligibility scores for binary mask processing applied to speech in babble. Degree of hearing loss and working memory capacity did not predict listeners' quality ratings.
The results indicate that envelope fidelity is a primary factor in determining the combined effects of noise and binary mask processing for intelligibility and quality of speech presented in babble noise. Degree of hearing loss and working memory capacity are significant factors in explaining variability in listeners' speech intelligibility scores but not in quality ratings.