Dillier N, Bögli H, Spillmann T
Department of Otorhinolaryngology, University Hospital, Zürich, Switzerland.
Scand Audiol Suppl. 1993;38:145-53.
The following processing strategies have been implemented on an experimental laboratory system of a cochlear implant digital speech processor (CIDSP) for the Nucleus 22-channel cochlear prosthesis. The first approach (PES, Pitch Excited Sampler) is based on the classical channel vocoder concept, whereby the time-averaged spectral energy of a number of logarithmically spaced frequency bands is transformed into appropriate electrical stimulation parameters for up to 22 electrodes. The pulse rate at any electrode is controlled by the voice pitch of the input speech signal. The pitch extraction algorithm calculates the autocorrelation function of a lowpass-filtered segment of the speech signal and searches for a peak within a specified time window. A random pulse rate of about 150 to 250 Hz is used for unvoiced speech portions. The second approach (CIS, Continuous Interleaved Sampler) uses a stimulation pulse rate that is independent of the input signal. The algorithm continuously scans all specified frequency bands (typically between 4 and 22) and samples their energy levels. Evaluation experiments with 7 experienced cochlear implant users showed significantly better performance in consonant identification tests with the new processing strategies than with the subjects' own wearable speech processors, whereas improvements in vowel identification tasks were rarely observed. Modifications of the basic PES and CIS strategies resulted in large variations in identification scores. Information transmission analysis of confusion matrices revealed a rather complex pattern across conditions and speech features. No final conclusions can yet be drawn. Optimization and fine-tuning of processing parameters for these coding strategies require more data from speech identification and discrimination tests as well as from psychophysical experiments.
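The pitch-controlled pulse rate described for the PES strategy can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no filter design, search window, or voicing criterion, so the moving-average lowpass, the 80 to 400 Hz search range, the 0.5 voicing threshold, and all function names below are assumptions chosen for illustration. Only the overall idea (autocorrelation of a lowpass-filtered segment, peak search within a lag window, random rate of about 150 to 250 Hz for unvoiced speech) comes from the text.

```python
import math
import random


def moving_average(x, width):
    """Crude lowpass filter (assumed): running mean over up to `width` samples."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / min(i + 1, width))
    return out


def autocorr(x, lag):
    """Unnormalised autocorrelation of x at a single lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def estimate_pitch(segment, sample_rate, min_hz=80.0, max_hz=400.0,
                   voicing_threshold=0.5, lowpass_width=5):
    """Return an estimated pitch in Hz, or None if the segment looks unvoiced.

    The search range, threshold, and filter width are illustrative assumptions.
    """
    x = moving_average(segment, lowpass_width)
    mean = sum(x) / len(x)
    x = [v - mean for v in x]
    r0 = autocorr(x, 0)
    if r0 <= 0.0:
        return None  # silent segment
    # Search window expressed in lags (samples): short lag = high pitch.
    lo = max(1, int(sample_rate / max_hz))
    hi = min(int(sample_rate / min_hz), len(x) - 1)
    best_lag, best_val = None, 0.0
    for lag in range(lo, hi + 1):
        r = autocorr(x, lag)
        if r > best_val:
            best_lag, best_val = lag, r
    # Treat a weak autocorrelation peak as unvoiced speech.
    if best_lag is None or best_val / r0 < voicing_threshold:
        return None
    return sample_rate / best_lag


def pulse_rate(segment, sample_rate, rng=random):
    """Pulse rate in Hz: the voice pitch if voiced, else random in 150-250 Hz."""
    f0 = estimate_pitch(segment, sample_rate)
    if f0 is None:
        return rng.uniform(150.0, 250.0)
    return f0


if __name__ == "__main__":
    sr = 8000
    voiced = [math.sin(2 * math.pi * 125 * n / sr) for n in range(512)]
    print(pulse_rate(voiced, sr))
```

With a synthetic 125 Hz tone the estimate falls close to 125 Hz, limited by the one-sample lag resolution; a noise segment with no clear autocorrelation peak is assigned a random rate instead.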