The relationship between perceptual disturbances in dysarthric speech and automatic speech recognition performance.

Authors

Tu Ming, Wisler Alan, Berisha Visar, Liss Julie M

Affiliations

Department of Speech and Hearing Science, Arizona State University, Tempe, Arizona 85287, USA.

School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, Arizona 85287, USA.

Publication information

J Acoust Soc Am. 2016 Nov;140(5):EL416. doi: 10.1121/1.4967208.

Abstract

State-of-the-art automatic speech recognition (ASR) engines perform well on healthy speech; however, recent studies show that their performance on dysarthric speech is highly variable. This is due to the acoustic variability associated with the different dysarthria subtypes. This paper aims to develop a better understanding of how perceptual disturbances in dysarthric speech relate to ASR performance. Accurate ratings of a representative set of 32 dysarthric speakers along different perceptual dimensions are obtained, and the performance of a representative ASR algorithm on the same set of speakers is analyzed. This work explores the relationship between these ratings and ASR performance and reveals that ASR performance can be predicted from perceptual disturbances in dysarthric speech, with articulatory precision contributing the most to the prediction, followed by prosody.
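
To make the idea of "predicting ASR performance from perceptual ratings" concrete, the following is a minimal sketch, not the authors' method: the abstract does not state which model was used, so ordinary least squares on z-scored data is assumed here. All data are synthetic placeholders, the generating weights are arbitrary, and the names `standardized_weights`, `asr_score`, and the third dimension `vocal_quality` are hypothetical. Only articulatory precision and prosody are named in the abstract.

```python
import numpy as np

def standardized_weights(ratings, asr_score):
    """Fit ordinary least squares on z-scored data.

    Returns one standardized weight per perceptual dimension; a larger
    absolute weight means that dimension contributes more to the fit.
    """
    X = np.asarray(ratings, dtype=float)
    y = np.asarray(asr_score, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    # Both sides are centered, so no intercept column is needed.
    weights, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return weights

# Synthetic placeholder data (NOT from the paper): 32 speakers rated on
# three perceptual dimensions using a hypothetical 1-7 scale.
rng = np.random.default_rng(0)
n_speakers = 32
dimensions = ["articulatory_precision", "prosody", "vocal_quality"]
ratings = rng.uniform(1.0, 7.0, size=(n_speakers, len(dimensions)))

# Arbitrary generating weights, chosen only so the example has structure.
generating_weights = np.array([0.6, 0.3, 0.1])
asr_score = ratings @ generating_weights + rng.normal(0.0, 0.1, n_speakers)

weights = standardized_weights(ratings, asr_score)
# Rank dimensions by the magnitude of their standardized weight.
ranking = [dimensions[i] for i in np.argsort(-np.abs(weights))]
print(ranking)
```

Standardizing both the ratings and the target puts the weights on a common scale, which is what allows the dimensions to be ranked by contribution. With real ratings, correlated dimensions would make such a ranking less stable, so a regularized or cross-validated model may be preferable.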
