A multi-views multi-learners approach towards dysarthric speech recognition using multi-nets artificial neural networks.

Author information

Shahamiri Seyed Reza, Salim Siti Salwah Binti

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2014 Sep;22(5):1053-63. doi: 10.1109/TNSRE.2014.2309336. Epub 2014 Mar 11.

DOI:10.1109/TNSRE.2014.2309336

Abstract

Automatic speech recognition (ASR) can be very helpful for speakers with dysarthria, a neurological disability that impairs control of the motor speech articulators. Although a few attempts have been made to apply ASR technologies to speakers with dysarthria, previous studies show that such ASR systems have not attained an adequate level of performance. This study proposes a dysarthric multi-networks speech recognizer (DM-NSR) model built on a realization of the multi-views multi-learners approach called multi-nets artificial neural networks (ANNs), which tolerates the variability of dysarthric speech. In particular, the DM-NSR model employs several ANNs (as learners) to approximate the likelihood of ASR vocabulary words and to deal with the complexity of dysarthric speech. The DM-NSR approach is presented under both speaker-dependent and speaker-independent paradigms. To highlight the performance of the proposed model relative to legacy models, multi-views single-learner versions of the DM-NSRs were also built and their efficiencies compared in detail. A comparison between prominent dysarthric ASR methods and the proposed one is also provided. The results show that the DM-NSR improved the recognition rate by up to 24.67% and reduced the error rate by up to 8.63% relative to the reference model.
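
The multi-learners idea in the abstract (several ANNs, each approximating the likelihood of a vocabulary word) can be illustrated with a minimal sketch. This is not the authors' DM-NSR implementation: the abstract does not specify network topology, features, or training procedure, so everything below is an assumption made for illustration. Each learner is reduced to a single logistic unit, the three-word vocabulary and the three-dimensional feature vectors are hypothetical toy data, and the multi-views part of the method is not modeled at all.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow for large |z|.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class WordNet:
    """One tiny learner (a single logistic unit) that scores one word.

    Assumption: a real system would use a larger ANN per word.
    """

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def score(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def train(self, samples, targets, epochs=200, lr=0.5):
        # Plain stochastic gradient descent on the log-loss.
        for _ in range(epochs):
            for x, t in zip(samples, targets):
                err = self.score(x) - t
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * err


class MultiNetRecognizer:
    """Holds one learner per vocabulary word and picks the best-scoring word."""

    def __init__(self, vocabulary, n_features):
        self.nets = {word: WordNet(n_features) for word in vocabulary}

    def fit(self, samples, labels):
        # Each learner sees the same samples with one-vs-rest targets.
        for word, net in self.nets.items():
            net.train(samples, [1.0 if lab == word else 0.0 for lab in labels])

    def recognize(self, x):
        return max(self.nets, key=lambda word: self.nets[word].score(x))


def make_toy_data(centers, per_word=20, noise=0.1, seed=0):
    # Hypothetical stand-in for acoustic features: noisy points around
    # one fixed center per word.
    rng = random.Random(seed)
    samples, labels = [], []
    for word, center in centers.items():
        for _ in range(per_word):
            samples.append([c + rng.gauss(0.0, noise) for c in center])
            labels.append(word)
    return samples, labels


# Hypothetical vocabulary and feature centers, chosen only for the demo.
CENTERS = {"yes": [1.0, 0.0, 0.0], "no": [0.0, 1.0, 0.0], "stop": [0.0, 0.0, 1.0]}
samples, labels = make_toy_data(CENTERS)
recognizer = MultiNetRecognizer(list(CENTERS), n_features=3)
recognizer.fit(samples, labels)
print(recognizer.recognize([0.9, 0.1, 0.0]))
```

Splitting the task into one small learner per word means each network only has to separate its own word from everything else, which is one plausible reading of how a multi-nets design copes with highly variable input.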

Similar articles

A multi-views multi-learners approach towards dysarthric speech recognition using multi-nets artificial neural networks.

IEEE Trans Neural Syst Rehabil Eng. 2014 Sep;22(5):1053-63. doi: 10.1109/TNSRE.2014.2309336. Epub 2014 Mar 11.

Vocal tract representation in the recognition of cerebral palsied speech.

J Speech Lang Hear Res. 2012 Aug;55(4):1190-207. doi: 10.1044/1092-4388(2011/11-0223). Epub 2012 Jan 23.

Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals.

Med Eng Phys. 2006 Oct;28(8):741-8. doi: 10.1016/j.medengphy.2005.11.002. Epub 2005 Dec 15.

The relationship between perceptual disturbances in dysarthric speech and automatic speech recognition performance.

J Acoust Soc Am. 2016 Nov;140(5):EL416. doi: 10.1121/1.4967208.

Evaluation of an Automatic Speech Recognition Platform for Dysarthric Speech.

Folia Phoniatr Logop. 2021;73(5):432-441. doi: 10.1159/000511042. Epub 2020 Nov 13.

Dysarthric Speech Transformer: A Sequence-to-Sequence Dysarthric Speech Recognition System.

IEEE Trans Neural Syst Rehabil Eng. 2023;31:3407-3416. doi: 10.1109/TNSRE.2023.3307020. Epub 2023 Aug 29.

Experiments in dysarthric speech recognition using artificial neural networks.

J Rehabil Res Dev. 1995 May;32(2):162-9.

Multi-Stage Audio-Visual Fusion for Dysarthric Speech Recognition With Pre-Trained Models.

IEEE Trans Neural Syst Rehabil Eng. 2023;31:1912-1921. doi: 10.1109/TNSRE.2023.3262001.

A speech-controlled environmental control system for people with severe dysarthria.

Med Eng Phys. 2007 Jun;29(5):586-93. doi: 10.1016/j.medengphy.2006.06.009. Epub 2006 Oct 17.

Speech Vision: An End-to-End Deep Learning-Based Dysarthric Automatic Speech Recognition System.

IEEE Trans Neural Syst Rehabil Eng. 2021;29:852-861. doi: 10.1109/TNSRE.2021.3076778. Epub 2021 May 7.

Cited by

Exploring the Role of Machine Learning in Diagnosing and Treating Speech Disorders: A Systematic Literature Review.

Psychol Res Behav Manag. 2024 May 31;17:2205-2232. doi: 10.2147/PRBM.S460283. eCollection 2024.

Automatic Speech Recognition Performance Improvement for Mandarin Based on Optimizing Gain Control Strategy.

Sensors (Basel). 2022 Apr 15;22(8):3027. doi: 10.3390/s22083027.

Hourly photosynthetically active radiation estimation in Midwestern United States from artificial neural networks and conventional regressions models.

Int J Biometeorol. 2016 Aug;60(8):1247-59. doi: 10.1007/s00484-015-1120-9. Epub 2015 Dec 29.
