Automated Assessment of Word- and Sentence-Level Speech Intelligibility in Developmental Motor Speech Disorders: A Cross-Linguistic Investigation.

Authors

Carl Micalle, Icht Michal

Affiliation

Department of Communication Disorders, Ariel University, Ariel 40700, Israel.

Publication

Diagnostics (Basel). 2025 Jul 28;15(15):1892. doi: 10.3390/diagnostics15151892.

Abstract

Background/Objectives: Accurate assessment of speech intelligibility is necessary for individuals with motor speech disorders. Transcription or scaled rating by naïve listeners is the most reliable method for this purpose; however, both are often resource-intensive and time-consuming in clinical contexts. Automatic speech recognition (ASR) systems, which transcribe speech into text, have been increasingly used to assess speech intelligibility. This study investigates the feasibility of using an open-source ASR system to assess speech intelligibility in Hebrew and English speakers with Down syndrome (DS).

Methods: Recordings from 65 Hebrew- and English-speaking participants were included: 33 speakers with DS and 32 typically developing (TD) peers. Speech samples (words, sentences) were transcribed using Whisper (OpenAI) and by naïve listeners. For word-level data, the proportion of agreement between ASR transcriptions and those of naïve listeners was compared across speaker groups (TD, DS) and languages (Hebrew, English). For Hebrew speakers, further comparisons were conducted across speaker groups and stimuli (words, sentences).

Results: The strength of the correlation between listener and ASR transcription scores varied across languages: for speakers with DS, it was higher for English (correlation coefficient = 0.98) than for Hebrew (0.81). The proportion of listener-ASR agreement was higher for TD speakers than for those with DS (0.94 vs. 0.74, respectively), and higher for English than for Hebrew speakers (0.91 for English DS speakers vs. 0.74 for Hebrew DS speakers). Among Hebrew speakers, listener-ASR agreement for single words was consistently higher than for sentences. Speakers' intelligibility influenced word-level agreement among Hebrew-speaking but not English-speaking participants with DS.

Conclusions: ASR performance for English closely approximated that of naïve listeners, suggesting potential near-future clinical applicability for single-word intelligibility assessment. In contrast, the lower proportion of agreement between human listeners and ASR for Hebrew speech indicates that broader clinical implementation may require further training of ASR models in this language.
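The two measures named in the abstract, the proportion of listener-ASR agreement and the correlation between listener and ASR scores, can be illustrated with a minimal Python sketch. The abstract does not give the exact scoring rules, so everything here is an assumption: agreement is taken as an exact match per item after simple normalization, the correlation is taken to be Pearson's, and the function names and sample words are hypothetical.

```python
from math import sqrt


def normalize(text: str) -> str:
    """Assumed normalization: trim whitespace and ignore letter case."""
    return text.strip().casefold()


def agreement_proportion(asr: list[str], listener: list[str]) -> float:
    """Share of items on which the ASR and listener transcriptions match.

    Assumption: one transcription per item from each source, compared
    as whole strings after normalization.
    """
    if not asr or len(asr) != len(listener):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(normalize(a) == normalize(b) for a, b in zip(asr, listener))
    return matches / len(asr)


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical word-level transcriptions for one speaker (not study data).
asr_words = ["ball", "cat", "dog", "house"]
listener_words = ["ball", "cat", "bog", "house"]
print(agreement_proportion(asr_words, listener_words))

# Hypothetical per-speaker intelligibility scores (not study data).
listener_scores = [0.20, 0.50, 0.80]
asr_scores = [0.10, 0.40, 0.90]
print(pearson(listener_scores, asr_scores))
```

Exact string matching is the strictest possible criterion. Hebrew text in particular may call for further normalization, such as handling optional vowel marks or spelling variants, before the two transcriptions are compared.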

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca1c/12345943/8612213fd799/diagnostics-15-01892-g001.jpg
