Dubno Judy R, Horwitz Amy R, Ahlstrom Jayne B
Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, Charleston, SC 29425, USA.
Ear Hear. 2007 Feb;28(1):2-17. doi: 10.1097/AUD.0b013e3180310212.
The aim of this experiment was to assess the contribution of cochlear nonlinearities to speech recognition in noise for individuals with normal hearing and a range of quiet thresholds. For signals close to the characteristic frequency (CF) of a place on the basilar membrane, the normal growth of response of the basilar membrane is linear at lower signal levels and compressed at medium to higher signal levels. In contrast, at moderate to high CFs, the basilar membrane responds more linearly to stimuli at frequencies well below the CF regardless of input level. Thus, for moderate-level speech and a lower frequency masker, the response to the masker grows linearly whereas the response to the speech is compressed, which may result in changes in the effectiveness of the masker on speech recognition with increases in masker level. To test this hypothesis, observed speech-recognition scores were compared with scores predicted using an audibility-based model, which did not include nonlinear effects that may influence masker effectiveness.
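The reasoning above can be illustrated with a minimal sketch, assuming a "broken-stick" input-output function: the response to a signal near CF grows linearly up to a breakpoint and is compressed above it, while the response to a lower frequency masker grows linearly at all levels. The function names, the 40-dB breakpoint, the compression slope of 0.5, and the 30-dB offset are hypothetical values chosen for illustration and are not taken from the study.

```python
def bm_response_db(level_db, breakpoint_db=40.0, compression=0.5):
    """Illustrative basilar-membrane response (dB) to a signal near CF.

    Linear (1 dB/dB) up to the breakpoint, compressed above it.
    All parameter values are hypothetical.
    """
    if level_db <= breakpoint_db:
        return level_db
    return breakpoint_db + compression * (level_db - breakpoint_db)


def masked_threshold_db(masker_db, breakpoint_db=40.0, compression=0.5,
                        offset_db=30.0):
    """Signal level whose response matches the off-frequency masker response.

    The masker response is assumed to grow linearly (masker_db - offset_db),
    so this inverts the broken-stick function at that response level.
    """
    target = masker_db - offset_db  # linear masker response at the signal place
    if target <= breakpoint_db:
        return target
    return breakpoint_db + (target - breakpoint_db) / compression
```

Under these assumptions, masked threshold grows at 1 dB/dB below the breakpoint and at 1/compression dB/dB above it, which is one way a linearly growing masker can become more effective against a compressed signal as masker level increases.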
Growth of simultaneous masking was measured for moderate-level bandpass-filtered nonsense syllables and for 350-msec pure tones at frequencies within the speech passband. Masker frequencies were within (on-frequency) or below (off-frequency) the speech passband. Estimates of basilar-membrane nonlinearities were derived from growth-of-masking functions for 10-msec, 2.0- and 4.0-kHz tones in narrowband, off-frequency maskers presented simultaneously. Subjects were 26 adults with normal hearing whose average quiet thresholds spanned a range of approximately 20 dB.
Breakpoints (i.e., the levels corresponding to the transitions from linear to nonlinear responses) were strongly associated with quiet thresholds but slopes measured above the breakpoints were independent of quiet thresholds. Individual differences were substantially larger for off-frequency masking of pure tones and speech than for on-frequency masking of pure tones and speech. Using an audibility-based predictive model, the change in speech audibility resulting from the compressed response to speech with increasing off-frequency masker level (and the resulting decline in scores) was well predicted from nonlinear growth of masking for pure tones measured in the same off-frequency masker. However, absolute speech-recognition predictions were generally inaccurate and were a function of how well pure-tone signal levels at masked threshold estimated masker effectiveness for speech. That is, subjects with lower off-frequency masked thresholds had less accurate predictions of speech recognition in off-frequency maskers.
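A breakpoint and the slope above it can be estimated from a growth-of-masking function in several ways. The sketch below shows one simple possibility, not necessarily the procedure used in the study: fit separate least-squares lines to the lower and upper portions of the function for every admissible split, keep the split with the smallest total squared error, and take the intersection of the two lines as the breakpoint. The function names and the returned keys are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fit_two_segments(masker_db, threshold_db):
    """Two-line fit to a growth-of-masking function (illustrative only).

    Expects at least four points sorted by masker level, so that each
    segment contains at least two points.
    """
    best = None
    for split in range(2, len(masker_db) - 1):
        lower = fit_line(masker_db[:split], threshold_db[:split])
        upper = fit_line(masker_db[split:], threshold_db[split:])
        total_sse = lower[2] + upper[2]
        if best is None or total_sse < best[0]:
            best = (total_sse, lower, upper)
    _, (s1, b1, _), (s2, b2, _) = best
    if abs(s1 - s2) < 1e-12:
        # Parallel lines: no identifiable breakpoint.
        return {"lower_slope": s1, "upper_slope": s2,
                "breakpoint_masker_db": None, "breakpoint_signal_db": None}
    x_break = (b2 - b1) / (s1 - s2)  # masker level where the two lines cross
    return {"lower_slope": s1, "upper_slope": s2,
            "breakpoint_masker_db": x_break,
            "breakpoint_signal_db": b1 + s1 * x_break}
```

With masker levels on the x-axis and signal levels at masked threshold on the y-axis, `upper_slope` corresponds to the slope measured above the breakpoint and `breakpoint_signal_db` to the signal level at the transition. Real threshold data are noisy, so a constrained or continuous piecewise fit may be preferable in practice.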
Large individual differences in off-frequency masking of pure tones and speech are consistent with the assumption that small changes in the shape of the basilar-membrane input-output function result in large changes in the amount of off-frequency masking but small (if any) changes in on-frequency masking, where the signal and masker are subject to similar compression. The growth of off-frequency masking for pure tones was correlated with that for speech, consistent with the underlying basilar-membrane response and with changes in breakpoints for subjects with normal hearing and a range of quiet thresholds. These results support a role for nonlinear effects in the understanding of speech in noise.