Department of Otolaryngology, Cheng Hsin General Hospital, Taipei, Taiwan.
Faculty of Medicine, Institute of Brain Science, National Yang Ming Chiao Tung University, Taipei, Taiwan.
J Med Internet Res. 2021 Oct 28;23(10):e25460. doi: 10.2196/25460.
Cochlear implant technology is a well-established approach to help deaf individuals hear speech again. It can improve speech intelligibility in quiet conditions, but there is still room for improvement in noisy conditions. More recently, deep learning-based noise reduction, such as noise classification and deep denoising autoencoder (NC+DDAE), has been shown to improve intelligibility for patients with cochlear implants compared with classical noise reduction algorithms.
Following the successful implementation of the NC+DDAE model in our previous study, this study aimed to propose an advanced noise reduction system using knowledge transfer technology, called NC+DDAE_T; examine the proposed NC+DDAE_T noise reduction system using objective evaluations and subjective listening tests; and investigate which layer substitution of the knowledge transfer technology in the NC+DDAE_T noise reduction system provides the best outcome.
The knowledge transfer technology was adopted to reduce the number of parameters of the NC+DDAE_T compared with the NC+DDAE. We investigated which layer should be substituted using short-time objective intelligibility and perceptual evaluation of speech quality scores as well as t-distributed stochastic neighbor embedding to visualize the features in each model layer. Moreover, we enrolled 10 cochlear implant users for listening tests to evaluate the benefits of the newly developed NC+DDAE_T.
The experimental results showed that substituting the middle layer (ie, the second layer in this study) of the noise-independent DDAE (NI-DDAE) model achieved the best performance gain in short-time objective intelligibility and perceptual evaluation of speech quality scores. Therefore, the parameters of layer 3 in the NI-DDAE were chosen to be replaced, thereby establishing the NC+DDAE_T. Both the objective evaluations and the listening tests showed that the proposed NC+DDAE_T noise reduction system performed similarly to the previous NC+DDAE in several noisy test conditions. However, the proposed NC+DDAE_T required only a quarter as many parameters as the NC+DDAE.
This study demonstrated that knowledge transfer technology can help reduce the number of parameters in an NC+DDAE while maintaining similar performance. This suggests that the proposed NC+DDAE_T model may reduce the implementation costs of this noise reduction system and provide more benefits for cochlear implant users.
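The layer substitution idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the baseline keeps one complete network per noise type, and that the knowledge-transfer variant keeps one shared base network plus a single replacement layer per noise type. The layer sizes, the number of noise types, the swapped layer index, and all function names are hypothetical, so the resulting parameter ratio is not meant to reproduce the reported one-quarter figure.

```python
import numpy as np


def init_layers(sizes, rng):
    """Create (weights, bias) pairs for a fully connected network."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def count_params(layers):
    """Total number of weights and biases in a list of layers."""
    return sum(W.size + b.size for W, b in layers)


def forward(layers, x):
    """ReLU on hidden layers, linear output layer."""
    for W, b in layers[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = layers[-1]
    return x @ W + b


def substitute(base, index, layer):
    """Return a copy of the base layer list with one layer swapped in."""
    swapped = list(base)
    swapped[index] = layer
    return swapped


rng = np.random.default_rng(0)
sizes = [20, 32, 32, 32, 20]  # hypothetical layer widths
n_noise_types = 4             # hypothetical number of noise classes
swap_index = 1                # hypothetical: which layer is noise specific

# Shared base network (stands in for the noise-independent model).
base = init_layers(sizes, rng)

# One replacement layer per noise type, same shape as the layer it replaces.
specific = {
    k: init_layers([sizes[swap_index], sizes[swap_index + 1]], rng)[0]
    for k in range(n_noise_types)
}

# Assumed baseline: a complete network stored for every noise type.
separate_total = n_noise_types * count_params(base)
# Substitution variant: one base network plus one small layer per noise type.
shared_total = count_params(base) + sum(
    W.size + b.size for W, b in specific.values())

# Enhance a batch of 5 feature frames with the model selected for noise type 2.
frames = rng.standard_normal((5, sizes[0]))
enhanced = forward(substitute(base, swap_index, specific[2]), frames)

print(separate_total, shared_total, enhanced.shape)
```

In this sketch, a noise classifier would pick the key into `specific` at run time. Only the swapped layer differs between noise types, which is where the parameter saving comes from.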