An Undecimated Wavelet-based Method for Cochlear Implant Speech Processing.

Author information

Hajiaghababa Fatemeh, Kermani Saeed, Marateb Hamid R

Affiliations

Department of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran.

Department of Medical Physics and Medical Engineering, Isfahan University of Medical Sciences, Isfahan, Iran.

Publication information

J Med Signals Sens. 2014 Oct;4(4):247-55.

PMID:25426428

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4236803/

Abstract

A cochlear implant is an implanted electronic device that provides a sensation of hearing to a person who is hard of hearing. It is often referred to as a bionic ear. This paper presents a novel speech coding strategy for cochlear implants based on the undecimated wavelet packet transform (UWPT). The UWPT is computed like the wavelet packet transform, except that it does not down-sample the output at each level. The speech data used in the study consist of 30 consonants, sampled at 16 kbps. The performance of the proposed UWPT method was compared with that of an infinite impulse response (IIR) filter-bank in terms of mean opinion score (MOS), the short-time objective intelligibility (STOI) measure and segmental signal-to-noise ratio (SNR). The undecimated wavelet method had better segmental SNR for about 96% of the input speech data, and its MOS was twice that of the IIR filter-bank. Statistical analysis showed that the UWT-based N-of-M strategy significantly improved MOS, STOI and segmental SNR (P < 0.001) compared with the IIR filter-bank based strategies. The advantage of the UWPT is that it is shift-invariant, which gives a dense approximation to the continuous wavelet transform. The information loss is therefore minimal, which is why the UWPT performed better than traditional filter-bank strategies in speech recognition tests. The results suggest that the UWPT could be a promising method for speech coding in cochlear implants, although its computational complexity is higher than that of traditional filter-banks.
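The two properties the abstract attributes to the UWPT (no down-sampling at any level, and shift-invariance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the Haar filter pair, circular boundary handling, the function names (`roll`, `uwpt_haar`, `decimated_haar`, `close`), and the random test signal. The abstract does not state which wavelet or boundary treatment was used. Bands are returned in natural tree order, not sorted by frequency.

```python
import math
import random


def roll(x, k):
    """Circularly delay the sequence x by k samples (y[n] = x[n - k])."""
    x = list(x)
    k %= len(x)
    return x[-k:] + x[:-k] if k else x


def uwpt_haar(x, levels):
    """Undecimated Haar wavelet packet decomposition (illustrative only).

    Every node is split into a low and a high band, and no band is
    down-sampled, so all 2**levels bands keep the input length. Instead
    of decimating, the filter is dilated: the delay doubles per level.
    """
    bands = [list(x)]
    for level in range(levels):
        step = 2 ** level  # dilated filter spacing replaces down-sampling
        split = []
        for band in bands:
            delayed = roll(band, step)
            split.append([(a + b) / 2 for a, b in zip(band, delayed)])  # low
            split.append([(a - b) / 2 for a, b in zip(band, delayed)])  # high
        bands = split
    return bands


def decimated_haar(x):
    """One level of the ordinary (down-sampled) Haar transform, for contrast."""
    low, high = uwpt_haar(x, 1)
    return low[1::2], high[1::2]


def close(a, b):
    """Element-wise approximate equality of two sequences."""
    return len(a) == len(b) and all(
        math.isclose(p, q, abs_tol=1e-12) for p, q in zip(a, b)
    )


# Hypothetical test signal: 64 samples of seeded Gaussian noise.
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(64)]

bands = uwpt_haar(signal, 3)
shifted_bands = uwpt_haar(roll(signal, 1), 3)

# Shift-invariance: delaying the input delays every band by the same amount.
print(all(close(s, roll(b, 1)) for s, b in zip(shifted_bands, bands)))
```

With this averaging/differencing normalization, each low/high pair sums back to its parent band, so adding all bands sample by sample recovers the input. The decimated transform lacks the shift property: a one-sample delay of the input selects a different set of coefficients rather than a delayed copy of the original ones. The price is redundancy, since three levels produce eight full-length bands, which is consistent with the higher computational complexity the abstract mentions.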
