
Consonant-Vowel Transition Models Based on Deep Learning for Objective Evaluation of Articulation.

Authors

Mathad Vikram C, Liss Julie M, Chapman Kathy, Scherer Nancy, Berisha Visar

Affiliations

zapr media labs, Bangalore, India, 560016.

College of Health Solutions, Arizona State University, Tempe, AZ-85287.

Publication information

IEEE/ACM Trans Audio Speech Lang Process. 2023;31:86-95. doi: 10.1109/taslp.2022.3209937. Epub 2022 Oct 10.

DOI: 10.1109/taslp.2022.3209937
PMID: 36712557
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9879020/
Abstract

Spectro-temporal dynamics of consonant-vowel (CV) transition regions are considered to provide robust cues related to articulation. In this work, we propose an objective measure of precise articulation, dubbed the objective articulation measure (OAM), by analyzing the CV transitions segmented around vowel onsets. The OAM is derived based on the posteriors of a convolutional neural network pre-trained to classify between different consonants using CV regions as input. We demonstrate that the OAM is correlated with perceptual measures in a variety of contexts including (a) adult dysarthric speech, (b) the speech of children with cleft lip/palate, and (c) a database of accented English speech from native Mandarin and Spanish speakers.
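The abstract states only that the OAM is derived from the posteriors of a consonant classifier and that it correlates with perceptual ratings; it does not give the formula. The sketch below is therefore an illustration under stated assumptions, not the authors' method: it assumes the score is the mean posterior probability assigned to the intended consonant across CV segments, and it uses a plain Pearson correlation to compare such scores with listener ratings. The function names (`softmax`, `articulation_score`, `pearson`) and the example logits are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw classifier scores into posterior probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def articulation_score(posteriors: Sequence[Sequence[float]],
                       targets: Sequence[int]) -> float:
    """ASSUMED score: mean posterior of the intended consonant per CV segment.

    posteriors[i] is the classifier's posterior vector for segment i;
    targets[i] is the index of the consonant the speaker intended.
    """
    if not targets or len(posteriors) != len(targets):
        raise ValueError("need one target label per CV segment")
    return sum(p[t] for p, t in zip(posteriors, targets)) / len(targets)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical logits: three CV segments, three consonant classes.
    logits = [[4.0, 1.0, 0.5], [0.2, 3.0, 0.1], [1.0, 1.1, 0.9]]
    posteriors = [softmax(row) for row in logits]
    print(articulation_score(posteriors, targets=[0, 1, 2]))
```

Under this assumed definition, a speaker whose CV transitions are classified confidently as the intended consonants scores close to 1, while imprecise articulation spreads the posterior mass across classes and lowers the score. Computing one score per speaker and passing the scores to `pearson` together with perceptual ratings mirrors the kind of comparison the abstract reports.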


Similar articles

1. Consonant-Vowel Transition Models Based on Deep Learning for Objective Evaluation of Articulation.
   IEEE/ACM Trans Audio Speech Lang Process. 2023;31:86-95. doi: 10.1109/taslp.2022.3209937. Epub 2022 Oct 10.
2. Perception of the [m]-[n] distinction in consonant-vowel (CV) and vowel-consonant (VC) syllables produced by child and adult talkers.
   J Acoust Soc Am. 2006 Mar;119(3):1697-711. doi: 10.1121/1.2140830.
3. Nasalance measures in Marathi consonant-vowel-consonant syllables with pressure consonants produced by children with and without cleft lip and palate.
   Cleft Palate Craniofac J. 2002 Jan;39(1):59-65. doi: 10.1597/1545-1569_2002_039_0059_nmimcv_2.0.co_2.
4. Effects of vowel context on the recognition of initial and medial consonants by cochlear implant users.
   Ear Hear. 2006 Dec;27(6):658-77. doi: 10.1097/01.aud.0000240543.31567.54.
5. Consonant accuracy in Mandarin-speaking children with repaired cleft palate.
   Int J Pediatr Otorhinolaryngol. 2015 Dec;79(12):2270-6. doi: 10.1016/j.ijporl.2015.10.022. Epub 2015 Oct 30.
6. Production of two Nasal Sounds by Speakers with Cleft Palate.
   Cleft Palate Craniofac J. 2018 Jul;55(6):876-882. doi: 10.1597/16-096. Epub 2018 Feb 26.
7. Impact of the LSVT on vowel articulation and coarticulation in Parkinson's disease.
   Clin Linguist Phon. 2015 Jun;29(6):424-40. doi: 10.3109/02699206.2015.1012301. Epub 2015 Feb 17.
8. An Electropalatographic Study of Variability in Arrernte Consonant Production.
   Phonetica. 2019;76(6):399-428. doi: 10.1159/000496409. Epub 2019 May 2.
9. Changes in vowel articulation with subthalamic nucleus deep brain stimulation in dysarthric speakers with Parkinson's disease.
   Parkinsons Dis. 2014;2014:487035. doi: 10.1155/2014/487035. Epub 2014 Oct 21.
10. Influence of timing of delayed hard palate closure on articulation skills in 3-year-old Danish children with unilateral cleft lip and palate.
    Int J Lang Commun Disord. 2018 Jan;53(1):130-143. doi: 10.1111/1460-6984.12331. Epub 2017 Jul 25.

Cited by

1. Artificial Intelligence Applications in Pediatric Craniofacial Surgery.
   Diagnostics (Basel). 2025 Mar 25;15(7):829. doi: 10.3390/diagnostics15070829.
2. Responsible development of clinical speech AI: Bridging the gap between clinical research and technology.
   NPJ Digit Med. 2024 Aug 9;7(1):208. doi: 10.1038/s41746-024-01199-1.
3. Dysarthria detection based on a deep learning model with a clinically-interpretable layer.
   JASA Express Lett. 2023 Jan;3(1):015201. doi: 10.1121/10.0016833.
