Consonant-Vowel Transition Models Based on Deep Learning for Objective Evaluation of Articulation.

Author information

Mathad Vikram C, Liss Julie M, Chapman Kathy, Scherer Nancy, Berisha Visar

Affiliations

Zapr Media Labs, Bangalore, India, 560016.

College of Health Solutions, Arizona State University, Tempe, AZ-85287.

Publication information

IEEE/ACM Trans Audio Speech Lang Process. 2023;31:86-95. doi: 10.1109/taslp.2022.3209937. Epub 2022 Oct 10.

Abstract

Spectro-temporal dynamics of consonant-vowel (CV) transition regions are considered to provide robust cues related to articulation. In this work, we propose an objective measure of precise articulation, dubbed the objective articulation measure (OAM), by analyzing the CV transitions segmented around vowel onsets. The OAM is derived based on the posteriors of a convolutional neural network pre-trained to classify between different consonants using CV regions as input. We demonstrate that the OAM is correlated with perceptual measures in a variety of contexts including (a) adult dysarthric speech, (b) the speech of children with cleft lip/palate, and (c) a database of accented English speech from native Mandarin and Spanish speakers.
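The abstract states only that the OAM is derived from the posteriors of a consonant classifier applied to CV regions; it does not give the formula. As a minimal sketch of the general idea, the following assumes (hypothetically) that the score is the mean posterior probability assigned to the intended consonant across CV segments, and that agreement with perceptual ratings is checked with a Pearson correlation. The function names, the averaging rule, and all numbers are illustrative assumptions, not the authors' method or results.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into posterior probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def articulation_score(posteriors, target_indices):
    """Mean posterior of the intended consonant over all CV segments.

    ASSUMPTION: this averaging rule is illustrative only; the abstract
    does not specify how the OAM is computed from the posteriors.
    """
    if not posteriors or len(posteriors) != len(target_indices):
        raise ValueError("need one target index per CV segment")
    picked = [p[t] for p, t in zip(posteriors, target_indices)]
    return sum(picked) / len(picked)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical classifier logits over three consonant classes for two
# CV segments per speaker (toy values, not taken from the paper).
targets = [0, 1]  # intended consonant class for each segment
precise = [softmax([4.0, 0.5, 0.2]), softmax([0.1, 3.5, 0.3])]
imprecise = [softmax([1.0, 0.9, 0.8]), softmax([0.7, 1.0, 0.9])]

score_precise = articulation_score(precise, targets)
score_imprecise = articulation_score(imprecise, targets)

# A speaker whose CV transitions are classified confidently as the
# intended consonants receives the higher score under this assumption.
print(score_precise > score_imprecise)
```

Given one such score per speaker and a matching list of perceptual ratings, `pearson(scores, ratings)` would quantify how closely the two agree.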
