A study of acoustic-to-articulatory inversion of speech by analysis-by-synthesis using chain matrices and the Maeda articulatory model.

Affiliation

Department of Electrical Engineering, University of California, Los Angeles, California 90095, USA.

Publication information

J Acoust Soc Am. 2011 Apr;129(4):2144-62. doi: 10.1121/1.3514544.

DOI: 10.1121/1.3514544
PMID: 21476670
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3188964/
Abstract

In this paper, a quantitative study of acoustic-to-articulatory inversion for vowel speech sounds by analysis-by-synthesis using the Maeda articulatory model is performed. For chain matrix calculation of vocal tract (VT) acoustics, the chain matrix derivatives with respect to area function are calculated and used in a quasi-Newton method for optimizing articulatory trajectories. The cost function includes a distance measure between natural and synthesized first three formants, and parameter regularization and continuity terms. Calibration of the Maeda model to two speakers, one male and one female, from the University of Wisconsin x-ray microbeam (XRMB) database, using a cost function, is discussed. Model adaptation includes scaling the overall VT and the pharyngeal region and modifying the outer VT outline using measured palate and pharyngeal traces. The inversion optimization is initialized by a fast search of an articulatory codebook, which was pruned using XRMB data to improve inversion results. Good agreement between estimated midsagittal VT outlines and measured XRMB tongue pellet positions was achieved for several vowels and diphthongs for the male speaker, with average pellet-VT outline distances around 0.15 cm, smooth articulatory trajectories, and less than 1% average error in the first three formants.
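
The cost function described in the abstract (a formant distance plus regularization and continuity terms) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the tube sections are lossless with an ideal open end at the lips, the physical constants are nominal values, `toy_synth` is a hypothetical two-parameter stand-in for the Maeda model, and the weights `w_reg` and `w_cont` are arbitrary. The names `chain_d_element`, `formants`, and `inversion_cost` are likewise invented here.

```python
import math

C_SOUND = 35000.0  # speed of sound in cm/s (assumed nominal value)
RHO = 0.00114      # air density in g/cm^3 (assumed nominal value)


def chain_d_element(areas, lengths, freq):
    """D element of the product of lossless tube chain matrices.

    Sections are ordered glottis to lips. With zero pressure at the lips
    (an idealization), the volume-velocity transfer is 1/D, so formants
    are the zeros of D.
    """
    k = 2.0 * math.pi * freq / C_SOUND
    # Running product [[a, j*b], [j*c, d]] stored as four real numbers.
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for area, length in zip(areas, lengths):
        z = RHO * C_SOUND / area
        cs, sn = math.cos(k * length), math.sin(k * length)
        a2, b2, c2, d2 = cs, z * sn, sn / z, cs
        a, b, c, d = (a * a2 - b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, d * d2 - c * b2)
    return d


def formants(areas, lengths, n=3, f_max=5000.0, step=10.0):
    """First n formants: scan for sign changes of D, refine by bisection."""
    found = []
    f_lo = 5.0
    d_lo = chain_d_element(areas, lengths, f_lo)
    while f_lo < f_max and len(found) < n:
        f_hi = f_lo + step
        d_hi = chain_d_element(areas, lengths, f_hi)
        if d_lo * d_hi < 0.0:
            lo, hi, dl = f_lo, f_hi, d_lo
            for _ in range(40):
                mid = 0.5 * (lo + hi)
                dm = chain_d_element(areas, lengths, mid)
                if dl * dm <= 0.0:
                    hi = mid
                else:
                    lo, dl = mid, dm
            found.append(0.5 * (lo + hi))
        f_lo, d_lo = f_hi, d_hi
    return found


def toy_synth(params):
    """Hypothetical two-parameter mapping: scale back and front areas."""
    back = 3.0 * math.exp(params[0])
    front = 3.0 * math.exp(params[1])
    return formants([back] * 5 + [front] * 5, [1.75] * 10)


def inversion_cost(traj, measured, synth, w_reg=0.01, w_cont=0.1):
    """Cost over a trajectory: one parameter vector and formant set per frame."""
    cost = 0.0
    for t, (params, target) in enumerate(zip(traj, measured)):
        model = synth(params)
        # Acoustic term: relative squared error over the formants.
        cost += sum(((fm - ft) / ft) ** 2 for fm, ft in zip(model, target))
        # Regularization: penalize departure from the neutral setting (zero).
        cost += w_reg * sum(p * p for p in params)
        # Continuity: penalize frame-to-frame parameter jumps.
        if t > 0:
            cost += w_cont * sum((p - q) ** 2
                                 for p, q in zip(params, traj[t - 1]))
    return cost


if __name__ == "__main__":
    # A uniform 17.5 cm tube should resonate near 500, 1500, and 2500 Hz.
    print(formants([3.0] * 10, [1.75] * 10))
```

In the paper, the minimization uses a quasi-Newton method with analytically computed chain matrix derivatives and is initialized from an articulatory codebook. This sketch stops at evaluating the cost; a general-purpose quasi-Newton routine such as `scipy.optimize.minimize` with `method="BFGS"` could be applied to a flattened trajectory, at the price of finite-difference gradients.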
