Characterizing Dysarthria Diversity for Automatic Speech Recognition: A Tutorial from the Clinical Perspective.

Authors

Rowe Hannah P, Gutz Sarah E, Maffei Marc F, Tomanek Katrin, Green Jordan R

Affiliations

MGH Institute of Health Professions, Department of Rehabilitation Sciences, Boston, MA, United States.

Harvard University, Department of Speech and Hearing Bioscience and Technology, Boston, MA, United States.

Publication

Front Comput Sci. 2022 Apr;4. doi: 10.3389/fcomp.2022.770210. Epub 2022 Apr 12.

DOI: 10.3389/fcomp.2022.770210
PMID: 37860708
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10586392/
Abstract

Despite significant advancements in automatic speech recognition (ASR) technology, even the best performing ASR systems are inadequate for speakers with impaired speech. This inadequacy may be, in part, due to the challenges associated with acquiring a sufficiently diverse training sample of disordered speech. Speakers with dysarthria, which refers to a group of divergent speech disorders secondary to neurologic injury, exhibit highly variable speech patterns both within and across individuals. This diversity is currently poorly characterized and, consequently, difficult to adequately represent in disordered speech ASR corpora. In this paper, we consider the variable expressions of dysarthria within the context of established clinical taxonomies (e.g., Darley, Aronson, and Brown dysarthria subtypes). We also briefly consider past and recent efforts to capture this diversity quantitatively using speech analytics. Understanding dysarthria diversity from the clinical perspective and how this diversity may impact ASR performance could aid in (1) optimizing data collection strategies for minimizing bias; (2) ensuring representative ASR training sets; and (3) improving generalization of ASR across users and performance for difficult-to-recognize speakers. Our overarching goal is to facilitate the development of robust ASR systems for dysarthric speech using clinical knowledge.
