Automatic Speech Recognition in Primary Progressive Apraxia of Speech.

Affiliations

Department of Neurology, Mayo Clinic, Rochester, MN.

Department of Radiology, Mayo Clinic, Rochester, MN.

Publication information

J Speech Lang Hear Res. 2024 Sep 12;67(9):2964-2976. doi: 10.1044/2024_JSLHR-24-00049. Epub 2024 Aug 6.

Abstract

INTRODUCTION

Transcribing disordered speech can be useful when diagnosing motor speech disorders such as primary progressive apraxia of speech (PPAOS), in which patients produce sound additions, deletions, and substitutions, or distortions and/or slow, segmented speech. Because transcribing speech is laborious and requires an experienced listener, using automatic speech recognition (ASR) systems for diagnosis and treatment monitoring is appealing. This study evaluated the efficacy of a readily available ASR system (wav2vec 2.0) in transcribing the speech of patients with PPAOS to determine whether the word error rate (WER) output by the ASR can differentiate between healthy speech and PPAOS and/or among its subtypes, whether WER correlates with AOS severity, and how the ASR's errors compare to those noted in manual transcriptions.

METHOD

Forty-five patients with PPAOS and 22 healthy controls were recorded repeating 13 words, 3 times each. The recordings were transcribed manually and using wav2vec 2.0. The WER and phonetic and prosodic speech errors were compared between groups, and ASR results were compared against manual transcriptions.
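
As an illustration of the WER metric named above, the following is a minimal sketch of the conventional definition: the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the ASR output, divided by the number of reference words. The abstract does not describe the study's actual scoring pipeline, so the function name, the lowercasing step, and the example words are assumptions made purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Hypothetical helper for illustration; not the study's scoring code.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up single-word repetitions (not taken from the study's word list).
exact = word_error_rate("catastrophe", "catastrophe")
split = word_error_rate("catastrophe", "cat as trophy")
```

Under this definition, a single target word that the ASR splits into several words yields a WER above 1 for that item, because each extra word counts as an insertion against a one-word reference.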

RESULTS

Mean overall WER was 0.88 for patients and 0.33 for controls. WER significantly correlated with AOS severity and accurately distinguished between patients and controls but not between AOS subtypes. The phonetic and prosodic errors from the ASR transcriptions were also unable to distinguish between subtypes, whereas errors calculated from human transcriptions could. There was poor agreement in the number of phonetic and prosodic errors between the ASR and human transcriptions.

CONCLUSIONS

This study demonstrates that ASR can be useful in differentiating healthy from disordered speech and evaluating PPAOS severity but does not distinguish PPAOS subtypes. ASR transcriptions showed weak agreement with human transcriptions; thus, ASR may be a useful tool for the transcription of speech in PPAOS, but the research questions posed must be carefully considered within the context of its limitations.

SUPPLEMENTAL MATERIAL

https://doi.org/10.23641/asha.26359417.

Similar articles

1
Automatic Speech Recognition in Primary Progressive Apraxia of Speech.
J Speech Lang Hear Res. 2024 Sep 12;67(9):2964-2976. doi: 10.1044/2024_JSLHR-24-00049. Epub 2024 Aug 6.
2
Characterizing Speech Errors Across Primary Progressive Apraxia of Speech Subtypes.
J Speech Lang Hear Res. 2024 Mar 11;67(3):811-820. doi: 10.1044/2023_JSLHR-23-00577. Epub 2024 Feb 20.
4
A Longitudinal Evaluation of Speech Rate in Primary Progressive Apraxia of Speech.
J Speech Lang Hear Res. 2021 Feb 17;64(2):392-404. doi: 10.1044/2020_JSLHR-20-00253. Epub 2021 Jan 21.
