Validity of Off-the-Shelf Automatic Speech Recognition for Assessing Speech Intelligibility and Speech Severity in Speakers With Amyotrophic Lateral Sclerosis.

Affiliations

Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA.

Department of Communicative Disorders and Sciences, University at Buffalo, NY.

Publication information

J Speech Lang Hear Res. 2022 Jun 8;65(6):2128-2143. doi: 10.1044/2022_JSLHR-21-00589. Epub 2022 May 27.

DOI:10.1044/2022_JSLHR-21-00589

PMID:35623334

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9567308/

Abstract

PURPOSE

There is increasing interest in using automatic speech recognition (ASR) systems to evaluate impairment severity or speech intelligibility in speakers with dysarthria. We assessed the clinical validity of one currently available off-the-shelf (OTS) ASR system (i.e., a Google Cloud ASR API) for indexing sentence-level speech intelligibility and impairment severity in individuals with amyotrophic lateral sclerosis (ALS), and we provided guidance for potential users of such systems in research and clinic.

METHOD

Using speech samples collected from 52 individuals with ALS and 20 healthy control speakers, we compared word recognition rate (WRR) from the commercially available Google Cloud ASR API (Machine WRR) to clinician-provided judgments of impairment severity, as well as sentence intelligibility (Human WRR). We assessed the internal reliability of Machine and Human WRR by comparing the standard deviation of WRR across sentences to the minimally detectable change (MDC), a clinical benchmark that indicates whether results are within measurement error. We also evaluated Machine and Human WRR diagnostic accuracy for classifying speakers into clinically established categories.
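The two quantities in this paragraph can be illustrated with a minimal sketch. It assumes WRR is the percentage of target words that appear, in order, in a transcript, and that the reliability check is a direct comparison of the across-sentence standard deviation with an MDC value. The abstract does not give the authors' exact scoring rules or the MDC they used, so the function names, the tokenization, and the numbers below are hypothetical.

```python
import re
import statistics


def words(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_recognition_rate(target, transcript):
    """Percent of target words recovered, in order, in the transcript.

    Assumption: matching is a longest-common-subsequence alignment of words.
    The study's own scoring procedure may differ.
    """
    ref, hyp = words(target), words(transcript)
    if not ref:
        raise ValueError("target sentence has no words")
    # Longest common subsequence of words via dynamic programming.
    prev = [0] * (len(hyp) + 1)
    for r in ref:
        curr = [0]
        for j, h in enumerate(hyp, start=1):
            if r == h:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def exceeds_mdc(sentence_wrrs, mdc):
    """True when the across-sentence SD of WRR is larger than the MDC."""
    return statistics.stdev(sentence_wrrs) > mdc


if __name__ == "__main__":
    # Made-up sentences and a made-up MDC, for illustration only.
    print(word_recognition_rate("the quick brown fox", "the brown fox"))  # 75.0
    print(exceeds_mdc([100.0, 90.0, 80.0], mdc=5.0))  # True
```

The same function can score either a human transcript or an ASR transcript against the target sentence, which is what makes Human WRR and Machine WRR directly comparable.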

RESULTS

Human WRR achieved better accuracy than Machine WRR when indexing speech severity, and, although related, Human and Machine WRR were not strongly correlated. When the speech signal was mixed with noise (noise-augmented ASR) to reduce a ceiling effect, Machine WRR performance improved. Internal reliability metrics were worse for Machine than Human WRR, particularly for typical and mildly impaired severity groups, although sentence length significantly impacted both Machine and Human WRRs.
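The noise-augmentation step can be sketched as mixing a noise signal into the speech signal at a chosen signal-to-noise ratio (SNR) before sending it to the recognizer. The abstract does not state the noise type or SNR used in the study, so this is a generic illustration; `mix_at_snr` and its parameters are hypothetical names.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale noise so the speech-to-noise RMS ratio equals snr_db, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    # SNR in dB is 20 * log10(rms_speech / rms_noise); solve for the noise RMS.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # A synthetic tone stands in for speech; real use would load audio samples.
    speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    noisy = mix_at_snr(speech, noise, snr_db=10.0)
    print(len(noisy))
```

Lowering the SNR makes recognition harder for every speaker, which pulls near-perfect scores away from 100% and leaves room for differences among typical and mildly impaired speakers to show up.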

CONCLUSIONS

Results indicated that the OTS ASR system was inadequate for early detection of speech impairment and grading overall speech severity. While Machine and Human WRR were correlated, ASR should not be used as a one-to-one proxy for transcription speech intelligibility or clinician severity ratings. Overall, findings suggested that the tested OTS ASR system, Google Cloud ASR, has limited utility for grading clinical speech impairment in speakers with ALS.
