Histogram of States Based Assistive System for Speech Impairment Due to Neurological Disorders.

Author information

Chandrakala S, Malini S, Veni S Vishnika

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2021;29:2425-2434. doi: 10.1109/TNSRE.2021.3125314. Epub 2021 Nov 25.

DOI:10.1109/TNSRE.2021.3125314

Abstract

Building assistive speech technology is challenging because dysarthric speech is impaired in ways such as breathy voice, strained speech, and distorted vowels and consonants. Learning compact and discriminative embeddings for dysarthric speech utterances is essential for impaired speech recognition. We propose a Histogram of States (HoS)-based approach that uses a Deep Neural Network-Hidden Markov Model (DNN-HMM) to learn word lattice-based compact and discriminative embeddings. The best state sequence chosen from the word lattice is used to represent each dysarthric speech utterance. A discriminative model-based classifier is then used to recognize these embeddings. The performance of the proposed approach is evaluated on three datasets: the 15 acoustically similar words and 100 common words datasets of the UA-SPEECH database, and a 50-word dataset of the TORGO database. The proposed HoS-based approach performs significantly better than the traditional Hidden Markov Model and DNN-HMM-based approaches on all three datasets. The discriminative ability and compactness of the proposed HoS-based embeddings lead to the best accuracy in impaired speech recognition.
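
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a histogram-of-states embedding: a variable-length sequence of HMM state IDs is mapped to a fixed-length vector of relative state frequencies. The function name `histogram_of_states`, the integer state IDs, and the example sequences are hypothetical and are not taken from the paper; the decoding of the best state sequence from a word lattice is assumed to have happened already.

```python
from collections import Counter
from typing import List, Sequence


def histogram_of_states(state_seq: Sequence[int], num_states: int) -> List[float]:
    """Map a variable-length state sequence to a fixed-length histogram.

    Assumption: states are integer IDs in range(num_states). Each entry of
    the result is the fraction of frames assigned to that state.
    """
    if num_states <= 0:
        raise ValueError("num_states must be positive")
    if any(s < 0 or s >= num_states for s in state_seq):
        raise ValueError("state ID outside range(num_states)")
    total = len(state_seq)
    if total == 0:
        # An empty utterance maps to the all-zero vector.
        return [0.0] * num_states
    counts = Counter(state_seq)
    return [counts.get(s, 0) / total for s in range(num_states)]


# Hypothetical best-path state IDs for two utterances of different lengths.
utt_a = [0, 0, 1, 1, 1, 3]
utt_b = [2, 2, 3, 3]

emb_a = histogram_of_states(utt_a, num_states=4)
emb_b = histogram_of_states(utt_b, num_states=4)
```

Both utterances yield vectors of the same length regardless of their duration, which is what allows a fixed-input discriminative classifier to be trained on them.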

Similar articles

Histogram of States Based Assistive System for Speech Impairment Due to Neurological Disorders.

IEEE Trans Neural Syst Rehabil Eng. 2021;29:2425-2434. doi: 10.1109/TNSRE.2021.3125314. Epub 2021 Nov 25.

Representation Learning Based Speech Assistive System for Persons With Dysarthria.

IEEE Trans Neural Syst Rehabil Eng. 2017 Sep;25(9):1510-1517. doi: 10.1109/TNSRE.2016.2638830. Epub 2016 Dec 13.

Improving Acoustic Models in TORGO Dysarthric Speech Database.

IEEE Trans Neural Syst Rehabil Eng. 2018 Mar;26(3):637-645. doi: 10.1109/TNSRE.2018.2802914.

Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals.

Med Eng Phys. 2006 Oct;28(8):741-8. doi: 10.1016/j.medengphy.2005.11.002. Epub 2005 Dec 15.

Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition.

IEEE Trans Neural Syst Rehabil Eng. 2017 Sep;25(9):1581-1591. doi: 10.1109/TNSRE.2017.2681691. Epub 2017 Mar 13.

Estimation of phoneme-specific HMM topologies for the automatic recognition of dysarthric speech.

Comput Math Methods Med. 2013;2013:297860. doi: 10.1155/2013/297860. Epub 2013 Oct 8.

Effect of high-frequency spectral components in computer recognition of dysarthric speech based on a Mel-cepstral stochastic model.

J Rehabil Res Dev. 2005 May-Jun;42(3):363-71. doi: 10.1682/jrrd.2004.06.0067.

Assessment of Dysarthria Using One-Word Speech Recognition with Hidden Markov Models.

J Korean Med Sci. 2019 Apr 8;34(13):e108. doi: 10.3346/jkms.2019.34.e108.

Automated Dysarthria Severity Classification: A Study on Acoustic Features and Deep Learning Techniques.

IEEE Trans Neural Syst Rehabil Eng. 2022;30:1147-1157. doi: 10.1109/TNSRE.2022.3169814. Epub 2022 May 4.

Investigation of Different Time-Frequency Representations for Intelligibility Assessment of Dysarthric Speech.

IEEE Trans Neural Syst Rehabil Eng. 2020 Dec;28(12):2880-2889. doi: 10.1109/TNSRE.2020.3035392. Epub 2021 Jan 28.

Cited by

Pareto-Optimized Non-Negative Matrix Factorization Approach to the Cleaning of Alaryngeal Speech Signals.

Cancers (Basel). 2023 Jul 16;15(14):3644. doi: 10.3390/cancers15143644.
