Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA.
J Acoust Soc Am. 2012 Aug;132(2):1078-87. doi: 10.1121/1.4730905.
Speech recognition in noise presumably relies on the number and spectral location of available auditory-filter outputs that contain a relatively undistorted view of local target signal properties. The purpose of the present study was to estimate the relative weight of each of the 30 auditory-filter-wide bands between 80 and 7563 Hz. Because previous approaches were not compatible with this goal, a new technique was developed. As in the "hole" approach, the weight of a given band was assessed by comparing intelligibility in two conditions that differed in only one respect: the presence or absence of the band of interest. In contrast to the hole approach, however, random gaps were also created in the spectrum. These gaps were introduced to make the auditory system more sensitive to the removal of a single band, and their location was randomized to provide a general view of the weight of each band, i.e., one that does not depend on the location of information elsewhere in the spectrum. Frequency-weighting functions derived using this technique confirmed the main contribution of the 400-2500 Hz frequency region. However, they also revealed a complex microstructure, in contrast to the "bell curve" shape typically reported.
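The comparison logic described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' procedure or analysis code. The listener model (`additive_score`), the number of other bands kept per trial (`n_other`), the trial count, and all function names are assumptions made for illustration; only the count of 30 bands and the idea of scoring each band with and without it against randomly placed spectral gaps come from the abstract.

```python
import random

N_BANDS = 30  # number of auditory-filter-wide bands, as stated in the abstract


def additive_score(present_bands, true_weights):
    """Hypothetical listener: intelligibility is the summed weight of the
    bands that are present. Real listeners are not assumed to be additive."""
    return sum(true_weights[b] for b in present_bands)


def estimate_band_weights(score_fn, n_bands=N_BANDS, n_other=10,
                          n_trials=200, seed=0):
    """Estimate each band's relative weight as the mean intelligibility
    difference between trials with and without that band, measured against
    randomly chosen sets of other bands (the random spectral gaps).
    n_other and n_trials are illustrative values, not taken from the study."""
    rng = random.Random(seed)
    raw = []
    for target in range(n_bands):
        others = [b for b in range(n_bands) if b != target]
        total_diff = 0.0
        for _ in range(n_trials):
            # Random gaps: keep only a random subset of the non-target bands.
            kept = set(rng.sample(others, n_other))
            # The two conditions differ only in the presence of the target band.
            total_diff += score_fn(kept | {target}) - score_fn(kept)
        raw.append(total_diff / n_trials)
    total = sum(raw)
    # Normalize so the weights sum to 1; leave them unscaled if all are zero.
    return [w / total for w in raw] if total else raw


# Usage: recover a made-up, triangular weighting function from the toy listener.
shape = [1 + min(i, N_BANDS - 1 - i) for i in range(N_BANDS)]
true_weights = [s / sum(shape) for s in shape]
estimated = estimate_band_weights(lambda bands: additive_score(bands, true_weights))
```

With the purely additive toy listener, the estimates match the assumed weights regardless of which other bands are kept. Randomizing the gap locations matters when a band's contribution depends on what else is audible, in which case the average over random contexts gives the general, context-independent weight the abstract refers to.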