
Similar articles

1
Performance of Forced-Alignment Algorithms on Children's Speech.
J Speech Lang Hear Res. 2021 Jun 18;64(6S):2213-2222. doi: 10.1044/2020_JSLHR-20-00268. Epub 2021 Mar 11.
2
Examining Factors Influencing the Viability of Automatic Acoustic Analysis of Child Speech.
J Speech Lang Hear Res. 2018 Oct 26;61(10):2487-2501. doi: 10.1044/2018_JSLHR-S-17-0275.
3
Automatic alignment for New Englishes: Applying state-of-the-art aligners to Trinidadian English.
J Acoust Soc Am. 2020 Apr;147(4):2283. doi: 10.1121/10.0001069.
4
Advances in Completely Automated Vowel Analysis for Sociophonetics: Using End-to-End Speech Recognition Systems With DARLA.
Front Artif Intell. 2021 Sep 24;4:662097. doi: 10.3389/frai.2021.662097. eCollection 2021.
5
The Mason-Alberta Phonetic Segmenter: a forced alignment system based on deep neural networks and interpolation.
Phonetica. 2024 Sep 5;81(5):451-508. doi: 10.1515/phon-2024-0015. Print 2024 Oct 28.
6
Using automatic alignment to analyze endangered language data: testing the viability of untrained alignment.
J Acoust Soc Am. 2013 Sep;134(3):2235-46. doi: 10.1121/1.4816491.
7
Accuracy of Speech Sound Analysis: Comparison of an Automatic Artificial Intelligence Algorithm With Clinician Assessment.
J Speech Lang Hear Res. 2024 Sep 12;67(9):3004-3021. doi: 10.1044/2024_JSLHR-24-00009. Epub 2024 Aug 22.
8
How children learn to organize their speech gestures: further evidence from fricative-vowel syllables.
J Speech Hear Res. 1996 Apr;39(2):379-89. doi: 10.1044/jshr.3902.379.
9
The Effects of Speech Compression Algorithms on the Intelligibility of Two Individuals With Dysarthric Speech.
Am J Speech Lang Pathol. 2019 Feb 21;28(1):195-203. doi: 10.1044/2018_AJSLP-18-0081.
10
Considering Performance in the Automated and Manual Coding of Sociolinguistic Variables: Lessons From Variable (ING).
Front Artif Intell. 2021 Apr 29;4:648543. doi: 10.3389/frai.2021.648543. eCollection 2021.

Cited by

1
How Does Alignment Error Affect Automated Pronunciation Scoring in Children's Speech?
Interspeech. 2024 Sep;2024:5133-5137. doi: 10.21437/interspeech.2024-2239.
2
A Pilot Study of Listening Fatigue: Impacts of Pediatric Dysarthria on Adult Listeners.
Am J Speech Lang Pathol. 2025 Jul 29;34(4S):2409-2424. doi: 10.1044/2024_AJSLP-24-00259. Epub 2025 Apr 4.
3
A Tunable Forced Alignment System Based on Deep Learning: Applications to Child Speech.
J Speech Lang Hear Res. 2025 Jul 29;68(7S):3583-3601. doi: 10.1044/2024_JSLHR-24-00347. Epub 2025 Mar 31.
4
Speech recognition using an English multimodal corpus with integrated image and depth information.
Sci Rep. 2024 Nov 6;14(1):27000. doi: 10.1038/s41598-024-78557-2.
5
Improving Text-Independent Forced Alignment to Support Speech-Language Pathologists with Phonetic Transcription.
Sensors (Basel). 2023 Dec 6;23(24):9650. doi: 10.3390/s23249650.
6
Relating Acoustic Measures to Listener Ratings of Children's Productions of Word-Initial /ɹ/ and /w/.
J Speech Lang Hear Res. 2023 Sep 13;66(9):3413-3427. doi: 10.1044/2023_JSLHR-22-00713. Epub 2023 Aug 17.
7
Automation of Language Sample Analysis.
J Speech Lang Hear Res. 2023 Jul 12;66(7):2421-2433. doi: 10.1044/2023_JSLHR-22-00642. Epub 2023 Jun 22.
8
Consonant-Vowel Transition Models Based on Deep Learning for Objective Evaluation of Articulation.
IEEE/ACM Trans Audio Speech Lang Process. 2023;31:86-95. doi: 10.1109/taslp.2022.3209937. Epub 2022 Oct 10.
9
Speech Development Between 30 and 119 Months in Typical Children II: Articulation Rate Growth Curves.
J Speech Lang Hear Res. 2021 Nov 8;64(11):4057-4070. doi: 10.1044/2021_JSLHR-21-00206. Epub 2021 Sep 29.

References

1
Automatic speech recognition: A primer for speech-language pathology researchers.
Int J Speech Lang Pathol. 2018 Nov;20(6):599-609. doi: 10.1080/17549507.2018.1510033.
2
Examining Factors Influencing the Viability of Automatic Acoustic Analysis of Child Speech.
J Speech Lang Hear Res. 2018 Oct 26;61(10):2487-2501. doi: 10.1044/2018_JSLHR-S-17-0275.
3
Children's Consonant Acquisition in 27 Languages: A Cross-Linguistic Review.
Am J Speech Lang Pathol. 2018 Nov 21;27(4):1546-1571. doi: 10.1044/2018_AJSLP-17-0100.
4
Methods for eliciting, annotating, and analyzing databases for child speech development.
Comput Speech Lang. 2017 Sep;45:278-299. doi: 10.1016/j.csl.2017.02.010.

Performance of Forced-Alignment Algorithms on Children's Speech.

Authors

Mahr Tristan J, Berisha Visar, Kawabata Kan, Liss Julie, Hustad Katherine C

Affiliations

Waisman Center, University of Wisconsin-Madison.

Department of Communication Sciences and Disorders, Arizona State University, Tempe.

Publication information

J Speech Lang Hear Res. 2021 Jun 18;64(6S):2213-2222. doi: 10.1044/2020_JSLHR-20-00268. Epub 2021 Mar 11.

DOI: 10.1044/2020_JSLHR-20-00268
PMID: 33705675
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8740721/

Abstract

Purpose: Acoustic measurement of speech sounds requires first segmenting the speech signal into relevant units (words, phones, etc.). Manual segmentation is cumbersome and time consuming. Forced-alignment algorithms automate this process by aligning a transcript and a speech sample. We compared the phoneme-level alignment performance of five available forced-alignment algorithms on a corpus of child speech. Our goal was to document aligner performance for child speech researchers.

Method: The child speech sample included 42 children between 3 and 6 years of age. The corpus was force-aligned using the Montreal Forced Aligner with and without speaker adaptive training, triphone alignment from the Kaldi speech recognition engine, the Prosodylab-Aligner, and the Penn Phonetics Lab Forced Aligner. The sample was also manually aligned to create gold-standard alignments. We evaluated alignment algorithms in terms of accuracy (whether the interval covers the midpoint of the manual alignment) and difference in phone-onset times between the automatic and manual intervals.

Results: The Montreal Forced Aligner with speaker adaptive training showed the highest accuracy and smallest timing differences. Vowels were consistently the most accurately aligned class of sounds across all the aligners, and alignment accuracy increased with age for fricative sounds across the aligners too.

Conclusion: The best-performing aligner fell just short of human-level reliability for forced alignment. Researchers can use forced alignment with child speech for certain classes of sounds (vowels, fricatives for older children), especially as part of a semi-automated workflow where alignments are later inspected for gross errors.

Supplemental Material: https://doi.org/10.23641/asha.14167058
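The two evaluation measures named in the abstract (midpoint-coverage accuracy and phone-onset time difference) can be illustrated with a minimal Python sketch. This is not the authors' code. The `Interval` structure, the function names, the inclusive treatment of interval boundaries, and the one-to-one pairing of automatic and manual phones are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    """One phone interval, times in seconds (hypothetical structure)."""
    phone: str
    onset: float
    offset: float


def covers_midpoint(auto: Interval, manual: Interval) -> bool:
    """True if the automatic interval contains the midpoint of the manual one.

    Boundaries are treated as inclusive; that choice is an assumption.
    """
    midpoint = (manual.onset + manual.offset) / 2
    return auto.onset <= midpoint <= auto.offset


def accuracy(auto: List[Interval], manual: List[Interval]) -> float:
    """Proportion of phones whose automatic interval covers the manual midpoint.

    Assumes both lists hold the same phones in the same order.
    """
    if len(auto) != len(manual) or not auto:
        raise ValueError("expected two non-empty lists of equal length")
    hits = sum(covers_midpoint(a, m) for a, m in zip(auto, manual))
    return hits / len(auto)


def onset_differences(auto: List[Interval], manual: List[Interval]) -> List[float]:
    """Signed onset difference (automatic minus manual) per phone, in seconds."""
    return [a.onset - m.onset for a, m in zip(auto, manual)]


# Made-up example data, not taken from the study.
manual = [Interval("s", 0.00, 0.10), Interval("ae", 0.10, 0.30)]
auto = [Interval("s", 0.02, 0.12), Interval("ae", 0.22, 0.35)]

print(accuracy(auto, manual))
print(onset_differences(auto, manual))
```

In this made-up example the first automatic interval covers the manual midpoint (0.05 s) and the second does not (0.20 s falls before its 0.22 s onset), so one of the two phones counts as accurately aligned.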
