Fu Q J, Galvin J J
Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
J Acoust Soc Am. 2001 Mar;109(3):1166-72. doi: 10.1121/1.1344158.
This experiment examined the effects of spectral resolution and fine spectral structure on recognition of spectrally asynchronous sentences by normal-hearing and cochlear implant listeners. Sentence recognition was measured in six normal-hearing subjects listening to either full-spectrum or noise-band processors and five Nucleus-22 cochlear implant listeners fitted with 4-channel continuous interleaved sampling (CIS) processors. For the full-spectrum processor, the speech signals were divided into either 4 or 16 channels. For the noise-band processor, after band-pass filtering into 4 or 16 channels, the envelope of each channel was extracted and used to modulate noise of the same bandwidth as the analysis band, thus eliminating the fine spectral structure available in the full-spectrum processor. For the 4-channel CIS processor, the amplitude envelopes extracted from four bands were transformed to electric currents by a power function and the resulting electric currents were used to modulate pulse trains delivered to four electrode pairs. For all processors, the output of each channel was time-shifted relative to other channels, varying the channel delay across channels from 0 to 240 ms (in 40-ms steps). Within each delay condition, all channels were desynchronized such that the cross-channel delays between adjacent channels were maximized, thereby avoiding local pockets of channel synchrony. Results show no significant difference between the 4- and 16-channel full-spectrum speech processor for normal-hearing listeners. Recognition scores dropped significantly only when the maximum delay reached 200 ms for the 4-channel processor and 240 ms for the 16-channel processor. When fine spectral structures were removed in the noise-band processor, sentence recognition dropped significantly when the maximum delay was 160 ms for the 16-channel noise-band processor and 40 ms for the 4-channel noise-band processor. 
There was no significant difference between implant listeners using the 4-channel CIS processor and normal-hearing listeners using the 4-channel noise-band processor. The results imply that when fine spectral structures are not available, as in the implant listener's case, increased spectral resolution is important for overcoming cross-channel asynchrony in speech signals.
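The cross-channel desynchronization described above can be illustrated with a minimal sketch. The abstract does not give the exact delay assigned to each channel, so the following assumes (hypothetically) that delays are evenly spaced between 0 and the maximum delay and are ordered by interleaving the upper and lower halves of the delay set, which keeps adjacent channels at least floor(n/2) steps apart. The function names `interleaved_delays`, `delay_channel`, and `desynchronize` are illustrative only and are not taken from the study.

```python
def interleaved_delays(n_channels, max_delay_ms):
    """Return one delay (ms) per channel, listed in channel order.

    Assumption (not from the study): delays are evenly spaced from 0 to
    max_delay_ms, then ordered high, low, high, low, ... so that
    neighbouring channels never receive similar delays.
    """
    if n_channels < 2:
        return [0.0] * n_channels
    step = max_delay_ms / (n_channels - 1)
    values = [i * step for i in range(n_channels)]
    half = n_channels // 2
    lows, highs = values[:half], values[half:]
    order = []
    for i, high in enumerate(highs):
        order.append(high)
        if i < len(lows):
            order.append(lows[i])
    return order


def delay_channel(samples, delay_ms, sample_rate_hz):
    """Time-shift one channel by prepending silence (zeros)."""
    shift = round(delay_ms * sample_rate_hz / 1000)
    return [0.0] * shift + list(samples)


def desynchronize(channels, max_delay_ms, sample_rate_hz):
    """Delay every channel, then zero-pad all channels to a common length."""
    delays = interleaved_delays(len(channels), max_delay_ms)
    shifted = [
        delay_channel(ch, d, sample_rate_hz) for ch, d in zip(channels, delays)
    ]
    length = max(len(ch) for ch in shifted)
    return [ch + [0.0] * (length - len(ch)) for ch in shifted]


if __name__ == "__main__":
    # Four channels with a 240-ms maximum delay, as in one of the conditions.
    print(interleaved_delays(4, 240))
```

Under these assumptions, a 4-channel processor with a 240-ms maximum delay would receive delays of 160, 0, 240, and 80 ms, so that no two adjacent channels are closer than 160 ms. The channel outputs returned by `desynchronize` would then be summed (acoustic processors) or delivered to separate electrode pairs (CIS processor).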