Fu Q J, Shannon R V
Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
Ear Hear. 2000 Jun;21(3):227-35. doi: 10.1097/00003446-200006000-00006.
To determine the consequences for phoneme recognition of errors in setting threshold and loudness levels in cochlear implant listeners using a 4-channel continuous interleaved sampling (CIS) speech processor.
Three Nucleus-22 cochlear implant listeners, who normally used the SPEAK speech processing strategy, participated in this study. An experimental 4-channel CIS speech processor was implemented in each listener as follows. Speech signals were band-pass filtered into four broad frequency bands, and the temporal envelope of the signal in each band was extracted by half-wave rectification and low-pass filtering. A power function was used to convert the extracted acoustic amplitudes to electric currents. The electric currents depended on the exponent of the mapping power function and on the electrode dynamic range, which was determined by the minimum and maximum stimulation levels. In the baseline condition, the minimum and maximum stimulation levels were defined as the psychophysically measured threshold level (T-level) and maximum comfortable level (C-level). In the experimental conditions, the maximum stimulation levels were fixed at the C-level and the dynamic range (in dB) was changed by varying the minimum stimulation levels on all electrodes. This manipulation simulates the effect of an erroneous measurement of the T-level. Phoneme recognition was measured as the dynamic range of the electrodes was changed from 1 dB to 20 dB and as the exponent of the power-law amplitude mapping function was changed from 0.1 to 0.4.
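The amplitude mapping described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact mapping formula, so the normalized power-law form, the function names, and the example values (acoustic range 0.01 to 1.0, C-level of 800 µA) are all assumptions. The dynamic range is taken as 20·log10(maximum/minimum) of the electric current.

```python
import math


def min_level_for_dynamic_range(c_level, dr_db):
    """Minimum stimulation level lying dr_db below c_level.

    Assumes the dynamic range is defined on current amplitude:
    dr_db = 20 * log10(c_level / min_level).
    """
    return c_level / (10 ** (dr_db / 20))


def map_amplitude(a, a_min, a_max, e_min, e_max, p):
    """Power-law map from an acoustic envelope amplitude to an electric level.

    Assumed form (not stated in the abstract):
        E = e_min + (e_max - e_min) * ((a - a_min) / (a_max - a_min)) ** p
    Amplitudes outside [a_min, a_max] are clipped to that range.
    """
    a = min(max(a, a_min), a_max)
    x = (a - a_min) / (a_max - a_min)  # normalized amplitude in [0, 1]
    return e_min + (e_max - e_min) * x ** p


if __name__ == "__main__":
    c_level = 800.0            # hypothetical C-level in microamps
    a_min, a_max = 0.01, 1.0   # hypothetical acoustic input range
    for dr_db in (3, 6, 12):
        e_min = min_level_for_dynamic_range(c_level, dr_db)
        for p in (0.1, 0.2, 0.4):
            e = map_amplitude(0.1, a_min, a_max, e_min, c_level, p)
            print(f"DR={dr_db:2d} dB  p={p}  min={e_min:6.1f}  E(0.1)={e:6.1f}")
```

A smaller exponent pushes mid-level acoustic amplitudes toward the maximum stimulation level (stronger compression), while a smaller dynamic range raises the minimum stimulation level toward the C-level.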
For each mapping condition, the electric dynamic range had a significant but weak effect on vowel and consonant recognition. For strong compression (exponent p = 0.1), the best vowel and consonant scores were obtained with a large dynamic range (12 dB). When the exponent of the mapping function was changed to 0.2 and 0.4, the dynamic range producing the highest scores decreased to 6 dB and 3 dB, respectively.
Phoneme recognition with a 4-channel CIS strategy was only mildly affected by large changes in both electric threshold and loudness mapping. Errors in threshold by a factor of 2 (6 dB) and in the loudness mapping exponent by a factor of 2 were required to produce a significant decrease in performance. In these extreme conditions, the effect of the electric dynamic range on phoneme recognition could be due to two independent factors: abnormal loudness growth and a reduction in the number of discriminable intensity steps. The decrease in performance caused by a reduced electric dynamic range can be compensated by a more expansive power-law mapping function, as long as the number of discriminable intensity steps is moderately large (e.g., >8).
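The equivalence between a factor of 2 and 6 dB quoted above follows from the amplitude decibel definition, assuming the threshold error is expressed as a ratio of current amplitudes (the symbols I_1 and I_2 are introduced here only for illustration):

```latex
\[
\Delta L = 20 \log_{10}\!\left(\frac{I_2}{I_1}\right)
         = 20 \log_{10}(2) \approx 6.02\ \text{dB}
\]
```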