The performance of different synthesis signals in acoustic models of cochlear implants.

Affiliation

Department of Electrical, Electronic, and Computer Engineering, University of Pretoria, Pretoria 0002, South Africa.

Publication information

J Acoust Soc Am. 2011 Feb;129(2):920-33. doi: 10.1121/1.3518760.

Abstract

Synthesis (carrier) signals in acoustic models embody assumptions about perception of auditory electric stimulation. This study compared speech intelligibility of consonants and vowels processed through a set of nine acoustic models that used Spectral Peak (SPEAK) and Advanced Combination Encoder (ACE)-like speech processing, using synthesis signals which were representative of signals used previously in acoustic models as well as two new ones. Performance of the synthesis signals was determined in terms of correspondence with cochlear implant (CI) listener results for 12 attributes of phoneme perception (consonant and vowel recognition; F1, F2, and duration information transmission for vowels; voicing, manner, place of articulation, affrication, burst, nasality, and amplitude envelope information transmission for consonants) using four measures of performance. Modulated synthesis signals produced the best correspondence with CI consonant intelligibility, while sinusoids, narrow noise bands, and varying noise bands produced the best correspondence with CI vowel intelligibility. The signals that performed best overall (in terms of correspondence with both vowel and consonant attributes) were modulated and unmodulated noise bands of varying bandwidth that corresponded to a linearly varying excitation width of 0.4 mm at the apical to 8 mm at the basal channels.
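As a rough illustration of the best-performing configuration, the sketch below computes an excitation width that varies linearly from 0.4 mm at the most apical channel to 8 mm at the most basal one, then converts each width to a noise-band frequency range. Only the two endpoint widths come from the abstract. The channel count, the channel positions along the cochlea, and the use of the Greenwood frequency-position function (with its commonly cited human constants) are assumptions made here for illustration, and may differ from the mapping actually used in the study.

```python
import numpy as np


def excitation_widths(n_channels, apical_mm=0.4, basal_mm=8.0):
    """Excitation width per channel, varying linearly from apex to base.

    The endpoint values are taken from the abstract; n_channels is a
    free parameter chosen by the caller.
    """
    return np.linspace(apical_mm, basal_mm, n_channels)


def greenwood_hz(x_mm):
    """Greenwood frequency-position map for the human cochlea.

    x_mm is the distance from the apex in millimetres. Using this map
    to turn widths into bandwidths is an assumption of this sketch.
    """
    return 165.4 * (10.0 ** (0.06 * np.asarray(x_mm, dtype=float)) - 0.88)


def noise_band_edges(centres_mm, widths_mm):
    """Lower and upper band edges (Hz) for bands centred at centres_mm."""
    centres_mm = np.asarray(centres_mm, dtype=float)
    half = np.asarray(widths_mm, dtype=float) / 2.0
    return greenwood_hz(centres_mm - half), greenwood_hz(centres_mm + half)


# Hypothetical layout: 8 channels placed 10 mm to 25 mm from the apex.
n_channels = 8
centres = np.linspace(10.0, 25.0, n_channels)
widths = excitation_widths(n_channels)
low, high = noise_band_edges(centres, widths)

for c, w, lo, hi in zip(centres, widths, low, high):
    print(f"centre {c:5.2f} mm  width {w:4.2f} mm  band {lo:7.0f}-{hi:7.0f} Hz")
```

Under these assumptions the bands widen steadily from apex to base, which is the qualitative property the abstract describes for the varying-bandwidth noise carriers.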
