Zhang C, Zeng F G
Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
J Acoust Soc Am. 1997 Nov;102(5 Pt 1):2925-34. doi: 10.1121/1.420347.
Traditional loudness models have been based on the average energy and the critical band analysis of steady-state sounds. However, most environmental sounds, including speech, are dynamic stimuli, in which the average level [e.g., the root-mean-square (rms) level] does not account for the large temporal fluctuations. The question addressed here was whether two stimuli of the same rms level but different peak levels would produce an equal loudness sensation. A modern adaptive procedure was used to replicate two classic experiments demonstrating that the sensation of "beats" in a two- or three-tone complex resulted in a louder sensation [E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models (Springer-Verlag, Berlin, 1990)]. Two additional experiments were conducted to study exclusively the effects of the temporal envelope on the loudness sensation of dynamic stimuli. Loudness balance was performed by normal-hearing listeners between a white noise and a sinusoidally amplitude-modulated noise in one experiment, and by cochlear implant listeners between two harmonic stimuli of the same magnitude spectra, but different phase spectra, in the other experiment. The results from both experiments showed that, for two stimuli of the same rms level, the stimulus with greater temporal fluctuations sometimes produced a significantly louder sensation, depending on the temporal frequency and overall stimulus level. In normal-hearing listeners, the louder sensation was produced for the amplitude-modulated stimuli with modulation frequencies lower than 400 Hz, and gradually disappeared above 400 Hz, resulting in a low-pass filtering characteristic which bore some similarity to the temporal modulation transfer function. The extent to which loudness was greater was a nonmonotonic function of level in acoustic hearing and a monotonically increasing function of level in electric hearing.
These results suggest that the loudness sensation of a dynamic stimulus is not limited to a 100-ms temporal integration process, and may be determined jointly by a compression process in the cochlea and an expansion process in the brain. A level-dependent compression scheme that may better restore normal loudness of dynamic stimuli in hearing aids and cochlear implants is proposed.
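The premise that two stimuli can share an rms level and a magnitude spectrum yet differ in peak level can be illustrated numerically. The following is a minimal sketch, not the stimuli used in the study: it assumes a 10-harmonic complex on a 100-Hz fundamental sampled at 16 kHz, and compares an all-zero (cosine) phase spectrum with a Schroeder-type quadratic phase spectrum. The function names, parameter values, and the choice of quadratic phase are all assumptions made for illustration.

```python
import math

def harmonic_complex(phases, f0=100.0, fs=16000):
    """One period of equal-amplitude cosine harmonics of f0 (hypothetical parameters)."""
    n_samples = int(round(fs / f0))
    return [
        sum(math.cos(2 * math.pi * (k + 1) * f0 * n / fs + ph)
            for k, ph in enumerate(phases))
        for n in range(n_samples)
    ]

def rms(x):
    """Root-mean-square value of a sampled waveform."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def peak(x):
    """Largest absolute sample value."""
    return max(abs(v) for v in x)

def crest_factor_db(x):
    """Peak-to-rms ratio in decibels."""
    return 20 * math.log10(peak(x) / rms(x))

N = 10  # number of harmonics (assumed for illustration)

# Same magnitude spectrum, two different phase spectra.
cosine_phase = [0.0] * N  # all components peak together
quadratic_phase = [-math.pi * k * (k + 1) / N for k in range(1, N + 1)]  # Schroeder-type

peaky = harmonic_complex(cosine_phase)
flat = harmonic_complex(quadratic_phase)

for name, x in (("cosine phase", peaky), ("quadratic phase", flat)):
    print(f"{name}: rms={rms(x):.3f} peak={peak(x):.3f} "
          f"crest={crest_factor_db(x):.1f} dB")
```

Both waveforms have the same rms value because phase does not affect the power of each harmonic, while the cosine-phase version concentrates its energy into a brief peak once per period. An rms-based level measure treats the two as equal, which is the situation the loudness-balance experiments probe.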