Drullman R, Festen J M, Plomp R
Department of Otorhinolaryngology, Free University Hospital, Amsterdam, The Netherlands.
J Acoust Soc Am. 1994 May;95(5 Pt 1):2670-80. doi: 10.1121/1.409836.
The effect of reducing low-frequency modulations in the temporal envelope on the speech-reception threshold (SRT) for sentences in noise and on phoneme identification was investigated. For this purpose, speech was split into a series of frequency bands (1/4, 1/2, or 1 oct wide) and the amplitude envelope for each band was high-pass filtered at cutoff frequencies of 1, 2, 4, 8, 16, 32, 64, or 128 Hz, or infinity (completely flattened). Results for 42 normal-hearing listeners show: (1) a clear reduction in sentence intelligibility with narrow-band processing for cutoff frequencies above 64 Hz; and (2) no reduction of sentence intelligibility when only amplitude variations below 4 Hz are reduced. Based on the modulation transfer function of some conditions, it is concluded that fast multichannel dynamic compression leads to an insignificant change in masked SRT. Combining these results with previous data on low-pass envelope filtering (temporal smearing) [Drullman et al., J. Acoust. Soc. Am. 95, 1053-1064 (1994)] shows that at 8-10 Hz the temporal modulation spectrum is divided into two equally important parts. Vowel and consonant identification with nonsense syllables was studied for cutoff frequencies of 2, 8, 32, 128 Hz, and infinity, processed in 1/4-oct bands. Results for 12 subjects indicate that, just as for low-pass envelope filtering, consonants are more affected than vowels. Errors in vowel identification mainly consist of reduced recognition of diphthongs and of durational confusions. For the consonants there are no clear confusion patterns, but stops appear to suffer least. In most cases, the responses tend to fall into the correct category (stop, fricative, or vowel-like).
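The envelope manipulation described above can be illustrated with a minimal sketch for a single band. It is not the authors' implementation: the function name `highpass_envelope`, the naive DFT-based filter, the choice to keep the envelope mean (DC) untouched, and the clipping of negative values to zero are all assumptions made for illustration. Band splitting and resynthesis with the band's fine structure are omitted.

```python
import cmath
import math


def highpass_envelope(envelope, fs, cutoff_hz):
    """Remove modulation components below cutoff_hz from one band's envelope.

    Assumptions (hypothetical, not taken from the paper): the mean (DC)
    is preserved, filtering is an ideal brick-wall in the DFT domain,
    and negative values after filtering are clipped to zero.
    """
    n = len(envelope)
    # Naive forward DFT (O(n^2)); adequate for a short illustrative signal.
    spectrum = [
        sum(envelope[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Zero every non-DC bin whose modulation frequency lies below the cutoff.
    for k in range(1, n):
        mod_freq = min(k, n - k) * fs / n  # fold negative-frequency bins
        if mod_freq < cutoff_hz:
            spectrum[k] = 0
    # Inverse DFT, keeping the real part.
    filtered = [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
    # An amplitude envelope cannot be negative.
    return [max(v, 0.0) for v in filtered]


# Toy envelope sampled at 64 Hz for 1 s: mean 1, plus 2 Hz and 16 Hz modulations.
fs = 64
env = [
    1 + 0.4 * math.sin(2 * math.pi * 2 * t / fs) + 0.4 * math.sin(2 * math.pi * 16 * t / fs)
    for t in range(fs)
]

filtered = highpass_envelope(env, fs, 8)    # removes the 2 Hz modulation only
flat = highpass_envelope(env, fs, math.inf)  # "infinity": completely flattened
```

With a cutoff of 8 Hz the slow 2 Hz fluctuation disappears while the 16 Hz modulation survives; with an infinite cutoff only the mean level remains, which corresponds to the completely flattened condition.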