Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

Authors

Shivakumar Prashanth Gurunath, Georgiou Panayiotis

Affiliation

Signal Processing for Communication Understanding & Behavior Analysis (SCUBA) Lab, University of Southern California, Los Angeles, California, USA.

Publication

Comput Speech Lang. 2020 Sep;63. doi: 10.1016/j.csl.2020.101077. Epub 2020 Feb 18.

DOI:10.1016/j.csl.2020.101077
PMID:32372847
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7199459/
Abstract

Children's speech recognition is challenging mainly because of the inherently high variability in children's physical and articulatory characteristics and expressions. This variability manifests in both acoustic constructs and linguistic usage, owing to the rapidly changing developmental stages of childhood. Part of the challenge is the lack of large amounts of available children's speech data for efficient modeling. This work attempts to address the key challenges using transfer learning from adults' models to children's models in a Deep Neural Network (DNN) framework for the children's Automatic Speech Recognition (ASR) task, evaluating on multiple children's speech corpora with a large vocabulary. The paper presents a systematic and extensive analysis of the proposed transfer learning technique, considering the key factors affecting children's speech recognition identified in prior literature. Evaluations are presented on (i) comparisons of earlier GMM-HMM models and the newer DNN models, (ii) the effectiveness of standard adaptation techniques versus transfer learning, and (iii) various adaptation configurations for tackling the variabilities present in children's speech, in terms of (a) acoustic spectral variability and (b) pronunciation variability and linguistic constraints. Our analysis spans (i) the number of DNN model parameters (for adaptation), (ii) the amount of adaptation data, (iii) the ages of children, and (iv) age-dependent versus age-independent adaptation. Finally, we provide recommendations on (i) the favorable strategies over the various analyzed parameters mentioned above, and (ii) potential future research directions and relevant challenges and problems persisting in DNN-based ASR for children's speech.
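The abstract does not say how parameters are chosen for adaptation, so the following is only a minimal sketch of the general idea of transfer learning with a restricted set of adapted parameters, not the authors' configuration. Everything in it is an assumption made for illustration: the toy two-layer network, the names `adult_model`, `child_data` and `adapt`, the synthetic data, and the learning rate. It copies a "source" (adult) model and fine-tunes it on "target" (child) data while updating only the layers named in `trainable`.

```python
import math
import random


def forward(params, x):
    """Toy two-layer network: tanh hidden layer followed by a linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in params["hidden"]]
    output = sum(w * h for w, h in zip(params["output"], hidden))
    return hidden, output


def mean_squared_error(params, data):
    return sum((forward(params, x)[1] - y) ** 2 for x, y in data) / len(data)


def adapt(source_params, target_data, trainable, lr=0.05, epochs=200):
    """Fine-tune a copy of the source model, updating only layers in `trainable`."""
    params = {"hidden": [row[:] for row in source_params["hidden"]],
              "output": source_params["output"][:]}
    for _ in range(epochs):
        for x, y in target_data:
            hidden, out = forward(params, x)
            err = out - y
            # Gradients of 0.5 * err**2, computed before any weights change.
            grad_output = [err * h for h in hidden]
            grad_hidden = [[err * w * (1.0 - h * h) * xi for xi in x]
                           for w, h in zip(params["output"], hidden)]
            if "output" in trainable:
                params["output"] = [w - lr * g
                                    for w, g in zip(params["output"], grad_output)]
            if "hidden" in trainable:
                params["hidden"] = [[w - lr * g for w, g in zip(row, g_row)]
                                    for row, g_row in zip(params["hidden"], grad_hidden)]
    return params


rng = random.Random(0)

# Hypothetical "adult" model: random weights stand in for a pretrained network.
adult_model = {"hidden": [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)],
               "output": [rng.uniform(-1, 1) for _ in range(4)]}

# Synthetic "child" targets: same hidden features, shifted output weights.
child_teacher = {"hidden": adult_model["hidden"],
                 "output": [w + 0.5 for w in adult_model["output"]]}
inputs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
child_data = [(x, forward(child_teacher, x)[1]) for x in inputs]

loss_before = mean_squared_error(adult_model, child_data)
top_only = adapt(adult_model, child_data, trainable={"output"})
all_layers = adapt(adult_model, child_data, trainable={"hidden", "output"})
loss_top_only = mean_squared_error(top_only, child_data)
```

Comparing `top_only` with `all_layers` on held-out child data is one simple way to explore the trade-off between the number of adapted parameters and the amount of adaptation data that the abstract lists among its analysis dimensions.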
