Dysarthric Speech Enhancement Based on Convolution Neural Network.

Publication information

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:60-64. doi: 10.1109/EMBC48229.2022.9871531.

DOI:10.1109/EMBC48229.2022.9871531
PMID:36085875
Abstract

Patients with dysarthria generally produce distorted speech whose intelligibility is limited for both human listeners and machines. To enhance the intelligibility of dysarthric speech, we applied a deep learning-based speech enhancement (SE) system to this task. Conventional SE approaches suppress noise components in a noise-corrupted input and thus improve sound quality and intelligibility simultaneously. In this study, we instead focus on reconstructing the severely distorted signal of dysarthric speech to improve intelligibility. The proposed SE system prepares a convolutional neural network (CNN) model in the training phase, which is then used to process dysarthric speech in the testing phase. Training requires paired dysarthric-normal speech utterances. We adopt a dynamic time warping technique to align the dysarthric-normal utterances, and the resulting training data are used to train a CNN-based SE model. The proposed SE system is evaluated with the Google automatic speech recognition (ASR) system and a subjective listening test. The results showed that, relative to the unprocessed dysarthric speech, the proposed method notably improved recognition performance by more than 10% for both ASR and human listeners. Clinical Relevance: This study improves the intelligibility and ASR accuracy of dysarthric speech by more than 10%.
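The alignment step mentioned in the abstract can be illustrated with a minimal dynamic time warping (DTW) sketch. The abstract does not state which acoustic features, distance measure, or step pattern the authors used, so everything below is an assumption for illustration: frames are plain lists of floats, the local distance is Euclidean, the step pattern is the standard symmetric one, and the names `dtw_align` and `paired_frames` are hypothetical.

```python
from math import inf


def dtw_align(src, tgt):
    """Align two sequences of feature frames with classic DTW.

    Assumptions (not taken from the paper): Euclidean frame distance and
    the standard step pattern (diagonal, vertical, horizontal).
    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    n, m = len(src), len(tgt)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    def dist(a, b):
        # Euclidean distance between two frames of equal dimension.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # cost[i][j] = minimal accumulated cost of aligning src[:i] with tgt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(src[i - 1], tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance src only
            (cost[i][j - 1], i, j - 1),          # advance tgt only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def paired_frames(dysarthric, normal):
    """Build frame-level (input, target) training pairs from a DTW path."""
    _, path = dtw_align(dysarthric, normal)
    return [(dysarthric[i], normal[j]) for i, j in path]


# Toy 1-D "features": the dysarthric utterance holds its first frame longer.
dys = [[0.0], [0.0], [1.0], [2.0]]
norm = [[0.0], [1.0], [2.0]]
total, path = dtw_align(dys, norm)
print(total, path)
```

In a setup like the one described, each aligned pair would serve as one (input, target) example for the enhancement model; real systems typically align spectral features rather than toy scalars.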
