Improving Acoustic Models in TORGO Dysarthric Speech Database.

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2018 Mar;26(3):637-645. doi: 10.1109/TNSRE.2018.2802914.

Abstract

Assistive speech-based technologies can improve the quality of life for people affected by dysarthria, a motor speech disorder. In this paper, we explore multiple ways to improve Gaussian mixture model (GMM) and deep neural network (DNN) based hidden Markov model (HMM) automatic speech recognition systems for the TORGO dysarthric speech database. This work shows significant improvements over previous attempts at building such systems on TORGO. We trained speaker-specific acoustic models by tuning various acoustic model parameters, using speaker-normalized cepstral features, and building complex DNN-HMM models with dropout and sequence-discriminative training strategies. The DNN-HMM models for severe and severe-moderate dysarthric speakers were further improved by using a generalized distillation framework to transfer dysarthria-specific information to DNN models trained on audio from both dysarthric and normal speech. To the best of our knowledge, this paper presents the best recognition accuracies reported for the TORGO database to date.
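
The abstract does not say how the generalized distillation objective was implemented, so the following is only a minimal sketch of the commonly used formulation, not the authors' code. It assumes a teacher network whose temperature-softened posteriors act as soft targets, and a student trained on a weighted mix of hard-label and soft-target cross-entropy. The names `temperature` and `imitation`, their default values, and the toy logits are illustrative assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax; a temperature above 1 flattens the distribution."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Mean cross-entropy between target and predicted distributions."""
    return float(-(target_probs * np.log(pred_probs + eps)).sum(axis=-1).mean())

def generalized_distillation_loss(student_logits, teacher_logits, labels,
                                  temperature=2.0, imitation=0.5):
    """Weighted sum of hard-label and soft-target cross-entropy.

    `imitation` (hypothetical name) is the weight on the teacher's soft
    targets: 0 ignores the teacher, 1 ignores the hard labels.
    """
    n_classes = student_logits.shape[-1]
    hard_targets = np.eye(n_classes)[labels]             # one-hot frame labels
    soft_targets = softmax(teacher_logits, temperature)  # softened teacher posteriors
    student_probs = softmax(student_logits)
    hard_term = cross_entropy(hard_targets, student_probs)
    soft_term = cross_entropy(soft_targets, student_probs)
    return (1.0 - imitation) * hard_term + imitation * soft_term

# Toy example: 2 frames, 3 output classes (made-up values, not TORGO data).
student_logits = np.array([[2.0, 0.0, 0.0], [0.5, 1.5, 0.0]])
teacher_logits = np.array([[3.0, 1.0, 0.0], [0.0, 2.0, 1.0]])
labels = np.array([0, 1])
loss = generalized_distillation_loss(student_logits, teacher_logits, labels)
```

In a DNN-HMM system the classes would be tied HMM states and the loss would be minimized by gradient descent over frames. The sketch only shows how the two targets are combined.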
