Ping Lichuan, Wang Ningyuan, Tang Guofang, Lu Thomas, Yin Li, Tu Wenhe, Fu Qian-Jie
a Nurotron Biotechnology, Inc., Irvine, CA, USA.
b Zhejiang Nurotron Biotechnology Co., Ltd, Zhejiang, PR China.
Cochlear Implants Int. 2017 Sep;18(5):240-249. doi: 10.1080/14670100.2017.1339492. Epub 2017 Jun 20.
Because of limited spectral resolution, Mandarin-speaking cochlear implant (CI) users have difficulty perceiving fundamental frequency (F0) cues that are important to lexical tone recognition. To improve Mandarin tone recognition in CI users, we implemented and evaluated a novel real-time algorithm (C-tone) to enhance the amplitude contour, which is strongly correlated with the F0 contour.
The C-tone algorithm was implemented in clinical processors and evaluated in eight users of the Nurotron NSP-60 CI system. Subjects were given 2 weeks of experience with C-tone. Recognition of Chinese tones, monosyllables, and disyllables in quiet was measured with and without the C-tone algorithm. Subjective quality ratings were also obtained for C-tone.
After 2 weeks of experience with C-tone, there were small but significant improvements in recognition of lexical tones, monosyllables, and disyllables (P < 0.05 in all cases). Among lexical tones, the largest improvements were observed for Tone 3 (falling-rising) and the smallest for Tone 4 (falling). Improvements with C-tone were greater for disyllables than for monosyllables. Subjective quality ratings showed no strong preference for or against C-tone, except for perception of own voice, where C-tone was preferred.
The real-time C-tone algorithm provided small but significant improvements for speech performance in quiet with no change in sound quality. Pre-processing algorithms to reduce noise and better real-time F0 extraction would improve the benefits of C-tone in complex listening environments.
Chinese CI users' speech recognition in quiet can be significantly improved by modifying the amplitude contour to better resemble the F0 contour.
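The abstract does not give implementation details for C-tone, so the following is only an illustrative sketch of the general idea of reshaping an amplitude contour so that it follows the F0 contour more closely. It is not the published algorithm. The function name `enhance_envelope`, the `depth` parameter, the log-F0 normalization, and the frame-wise inputs are all assumptions made for this example.

```python
import numpy as np


def enhance_envelope(envelope, f0, depth=0.5):
    """Scale a frame-wise amplitude envelope so it co-varies with F0.

    Hypothetical sketch, not the C-tone implementation.
    envelope, f0: 1-D sequences of equal length, one value per frame;
    f0 is 0 in unvoiced frames.
    depth: assumed modulation depth in [0, 1]; 0 leaves the envelope unchanged.
    """
    envelope = np.asarray(envelope, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    if envelope.shape != f0.shape:
        raise ValueError("envelope and f0 must have the same length")

    voiced = f0 > 0
    if not voiced.any():
        # No F0 information: return the envelope untouched.
        return envelope.copy()

    # Normalize log-F0 over the voiced frames to the range [0, 1].
    log_f0 = np.log(f0[voiced])
    span = log_f0.max() - log_f0.min()
    if span > 0:
        norm = (log_f0 - log_f0.min()) / span
    else:
        # Flat F0 contour: use a neutral value so the gain is 1.
        norm = np.full_like(log_f0, 0.5)

    # Gain runs from (1 - depth) at the lowest F0 to (1 + depth) at the highest.
    gain = np.ones_like(envelope)
    gain[voiced] = 1.0 - depth + 2.0 * depth * norm
    return envelope * gain


if __name__ == "__main__":
    # A flat envelope with a rising F0 contour (roughly Tone 2-like).
    env = np.ones(5)
    f0 = np.array([0.0, 100.0, 150.0, 200.0, 0.0])
    print(enhance_envelope(env, f0, depth=0.5))
```

In this sketch the unvoiced frames keep their original amplitude, and the voiced frames are attenuated or boosted according to where their F0 falls within the utterance's range. A real-time version would additionally need a running F0 estimate and a running range, which this example does not attempt.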