Usefulness of Automatic Speech Recognition Assessment of Children With Speech Sound Disorders: Validation Study.

Authors

Kim Do Hyung, Jeong Joo Won, Kang Dayoung, Ahn Taekyung, Hong Yeonjung, Im Younggon, Kim Jaewon, Kim Min Jung, Jang Dae-Hyun

Affiliations

Department of Rehabilitation Medicine, Incheon St Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea.

Department of English Language and Literature, Korea University, Seoul, Republic of Korea.

Publication information

J Med Internet Res. 2025 Jan 14;27:e60520. doi: 10.2196/60520.

DOI:10.2196/60520
PMID:39576242
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11775490/
Abstract

BACKGROUND

Speech sound disorders (SSDs) are common communication challenges in children, typically assessed by speech-language pathologists (SLPs) using standardized tools. However, traditional evaluation methods are time-intensive and prone to variability, raising concerns about reliability.

OBJECTIVE

This study aimed to compare the evaluation outcomes of SLPs and an automatic speech recognition (ASR) model using two standardized SSD assessments in South Korea, evaluating the ASR model's performance.

METHODS

A fine-tuned wav2vec 2.0 XLS-R model, pretrained on 436,000 hours of adult voice data spanning 128 languages, was used. The model was further trained on 93.6 minutes of children's voices with articulation errors to improve error detection. Participants included children referred to the Department of Rehabilitation Medicine at a general hospital in Incheon, South Korea, from August 19, 2022, to June 14, 2023. Two standardized assessments, the Assessment of Phonology and Articulation for Children (APAC) and the Urimal Test of Articulation and Phonology (U-TAP), were used, with ASR transcriptions compared to SLP transcriptions.
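The abstract does not state how the ASR and SLP transcriptions were aligned and scored. One common way to quantify such a comparison is a phoneme-level edit distance, and the sketch below assumes that approach. The function names and the example phoneme sequences are hypothetical and are not taken from the study.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: the SLP transcription is treated as the reference,
# and the ASR output differs from it by one substituted phoneme.
slp_transcription = ["s", "a", "g", "w", "a"]
asr_transcription = ["t", "a", "g", "w", "a"]
per = phoneme_error_rate(slp_transcription, asr_transcription)
print(f"Phoneme error rate: {per:.2%}")
```

Treating the SLP transcription as the reference is itself an assumption here; it mirrors the abstract's framing of error rates as discrepancies between the ASR model and the SLPs.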

RESULTS

This study included 30 children aged 3-7 years who were suspected of having SSDs. The phoneme error rates for the APAC and U-TAP were 8.42% (457/5430) and 8.91% (402/4514), respectively, indicating discrepancies between the ASR model and SLP transcriptions across all phonemes. Consonant error rates were 10.58% (327/3090) and 11.86% (331/2790) for the APAC and U-TAP, respectively. On average, there were 2.60 (SD 1.54) and 3.07 (SD 1.39) discrepancies per child for correctly produced phonemes, and 7.87 (SD 3.66) and 7.57 (SD 4.85) discrepancies per child for incorrectly produced phonemes, based on the APAC and U-TAP, respectively. The correlation between SLPs and the ASR model in terms of the percentage of consonants correct was excellent, with intraclass correlation coefficients of 0.984 (95% CI 0.953-0.994) and 0.978 (95% CI 0.941-0.990) for the APAC and U-TAP, respectively. The z scores between SLPs and ASR showed more pronounced differences with the APAC than the U-TAP, with 8 individuals showing discrepancies in the APAC compared to 2 in the U-TAP.
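The error rates above can be re-derived from the discrepancy and total counts given in parentheses. The short check below does that arithmetic; the dictionary layout and the function name are illustrative choices, while the counts and percentages are the ones reported in the abstract.

```python
# (discrepancies, total units, reported percentage) as given in the abstract.
reported = {
    ("APAC", "phoneme"): (457, 5430, 8.42),
    ("U-TAP", "phoneme"): (402, 4514, 8.91),
    ("APAC", "consonant"): (327, 3090, 10.58),
    ("U-TAP", "consonant"): (331, 2790, 11.86),
}


def error_rate_percent(discrepancies, total):
    """Error rate as a percentage, rounded to two decimal places."""
    return round(100 * discrepancies / total, 2)


for (test_name, unit), (errors, total, stated) in reported.items():
    computed = error_rate_percent(errors, total)
    status = "matches" if computed == stated else "differs from"
    print(f"{test_name} {unit} error rate: {computed:.2f}% ({status} reported {stated:.2f}%)")
```

All four recomputed values agree with the reported percentages to two decimal places.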

CONCLUSIONS

The results demonstrate the potential of the ASR model in assessing children with SSDs. However, its performance varied based on phoneme or word characteristics, highlighting areas for refinement. Future research should include more diverse speech samples, clinical settings, and speech data to strengthen the model's refinement and ensure broader clinical applicability.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c871/11775490/8738e8b5350a/jmir_v27i1e60520_fig1.jpg
