Histogram of States Based Assistive System for Speech Impairment Due to Neurological Disorders.

Authors

Chandrakala S, Malini S, Veni S Vishnika

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2021;29:2425-2434. doi: 10.1109/TNSRE.2021.3125314. Epub 2021 Nov 25.

DOI: 10.1109/TNSRE.2021.3125314
PMID: 34735346
Abstract

Assistive speech technology is challenging because of the impaired nature of dysarthric speech, which includes breathy voice, strained speech, and distorted vowels and consonants. Learning compact and discriminative embeddings for dysarthric speech utterances is essential for impaired speech recognition. We propose a Histogram of States (HoS)-based approach that uses a Deep Neural Network-Hidden Markov Model (DNN-HMM) to learn word lattice-based compact and discriminative embeddings. The best state sequence chosen from the word lattice is used to represent each dysarthric speech utterance. A discriminative model-based classifier is then used to recognize these embeddings. The performance of the proposed approach is evaluated on three datasets, namely the 15 acoustically similar words and 100 common words datasets of the UA-SPEECH database, and a 50-word dataset of the TORGO database. The proposed HoS-based approach performs significantly better than the traditional Hidden Markov Model and DNN-HMM-based approaches on all three datasets. The discriminative ability and compactness of the proposed HoS-based embeddings lead to the best accuracy in impaired speech recognition.
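The core idea in the abstract, turning a variable-length best state sequence into a fixed-length embedding, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `histogram_of_states`, the integer state IDs, the four-state inventory, and the length normalization are hypothetical, and the abstract does not specify how the authors build or normalize the histogram. Decoding the word lattice with a DNN-HMM is outside the scope of this sketch.

```python
from collections import Counter
from typing import List, Sequence


def histogram_of_states(
    state_sequence: Sequence[int], num_states: int, normalize: bool = True
) -> List[float]:
    """Map a variable-length sequence of state IDs to a fixed-length vector.

    Entry s of the result counts how often state s occurs in the sequence.
    With normalize=True the counts are divided by the sequence length
    (an assumption; the abstract does not state the normalization used).
    """
    if num_states <= 0:
        raise ValueError("num_states must be positive")
    counts = Counter(state_sequence)
    for state in counts:
        if not 0 <= state < num_states:
            raise ValueError(f"state ID {state} outside [0, {num_states})")
    hist = [float(counts.get(s, 0)) for s in range(num_states)]
    if normalize and len(state_sequence) > 0:
        total = float(len(state_sequence))
        hist = [c / total for c in hist]
    return hist


# Hypothetical best-path state IDs for two utterances of different lengths.
utt_a = [0, 0, 1, 1, 1, 3]
utt_b = [0, 1, 3, 3]

emb_a = histogram_of_states(utt_a, num_states=4)
emb_b = histogram_of_states(utt_b, num_states=4)

# Both embeddings have length 4 regardless of utterance duration.
print(emb_a)
print(emb_b)
```

Because every utterance maps to a vector of the same length, such embeddings could be passed to any fixed-input discriminative classifier; which classifier the authors used is not stated in the abstract.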

Similar articles

1
Histogram of States Based Assistive System for Speech Impairment Due to Neurological Disorders.
IEEE Trans Neural Syst Rehabil Eng. 2021;29:2425-2434. doi: 10.1109/TNSRE.2021.3125314. Epub 2021 Nov 25.
2
Representation Learning Based Speech Assistive System for Persons With Dysarthria.
IEEE Trans Neural Syst Rehabil Eng. 2017 Sep;25(9):1510-1517. doi: 10.1109/TNSRE.2016.2638830. Epub 2016 Dec 13.
3
Improving Acoustic Models in TORGO Dysarthric Speech Database.
IEEE Trans Neural Syst Rehabil Eng. 2018 Mar;26(3):637-645. doi: 10.1109/TNSRE.2018.2802914.
4
Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals.
Med Eng Phys. 2006 Oct;28(8):741-8. doi: 10.1016/j.medengphy.2005.11.002. Epub 2005 Dec 15.
5
Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition.
IEEE Trans Neural Syst Rehabil Eng. 2017 Sep;25(9):1581-1591. doi: 10.1109/TNSRE.2017.2681691. Epub 2017 Mar 13.
6
Estimation of phoneme-specific HMM topologies for the automatic recognition of dysarthric speech.
Comput Math Methods Med. 2013;2013:297860. doi: 10.1155/2013/297860. Epub 2013 Oct 8.
7
Effect of high-frequency spectral components in computer recognition of dysarthric speech based on a Mel-cepstral stochastic model.
J Rehabil Res Dev. 2005 May-Jun;42(3):363-71. doi: 10.1682/jrrd.2004.06.0067.
8
Assessment of Dysarthria Using One-Word Speech Recognition with Hidden Markov Models.
J Korean Med Sci. 2019 Apr 8;34(13):e108. doi: 10.3346/jkms.2019.34.e108.
9
Automated Dysarthria Severity Classification: A Study on Acoustic Features and Deep Learning Techniques.
IEEE Trans Neural Syst Rehabil Eng. 2022;30:1147-1157. doi: 10.1109/TNSRE.2022.3169814. Epub 2022 May 4.
10
Investigation of Different Time-Frequency Representations for Intelligibility Assessment of Dysarthric Speech.
IEEE Trans Neural Syst Rehabil Eng. 2020 Dec;28(12):2880-2889. doi: 10.1109/TNSRE.2020.3035392. Epub 2021 Jan 28.

Cited by

1
Pareto-Optimized Non-Negative Matrix Factorization Approach to the Cleaning of Alaryngeal Speech Signals.
Cancers (Basel). 2023 Jul 16;15(14):3644. doi: 10.3390/cancers15143644.