An episodic memory-based solution for the acoustic-to-articulatory inversion problem.

Affiliation

Université de Lorraine, Laboratoire Lorrain de Recherche en Informatique et ses Applications, Unité de Recherche Mixte 7503, Vandœuvre-lès-Nancy, F-54506, France.

Publication information

J Acoust Soc Am. 2013 May;133(5):2921-30. doi: 10.1121/1.4798665.

DOI: 10.1121/1.4798665
PMID: 23654397
Abstract

This paper presents an acoustic-to-articulatory inversion method based on an episodic memory. An episodic memory is an interesting model for two reasons. First, it does not rely on any assumptions about the mapping function but rather it relies on real synchronized acoustic and articulatory data streams. Second, the memory inherently represents the real articulatory dynamics as observed. It is argued that the computational models of episodic memory, as they are usually designed, cannot provide a satisfying solution for the acoustic-to-articulatory inversion problem due to the insufficient quantity of training data. Therefore, an episodic memory is proposed, called generative episodic memory (G-Mem), which is able to produce articulatory trajectories that do not belong to the set of episodes the memory is based on. The generative episodic memory is evaluated using two electromagnetic articulography corpora: one for English and one for French. Comparisons with a codebook-based method and with a classical episodic memory (which is termed concatenative episodic memory) are presented in order to evaluate the proposed generative episodic memory in terms of both its modeling of articulatory dynamics and its generalization capabilities. The results show the effectiveness of the method where an overall root-mean-square error of 1.65 mm and a correlation of 0.71 are obtained for the G-Mem method. They are comparable to those of methods recently proposed.
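The two figures quoted in the abstract, root-mean-square error (in mm) and correlation, are standard ways of comparing an estimated articulatory trajectory with the measured one. The sketch below shows how such metrics are commonly computed for a single trajectory. It is only an illustration: the function names and the sample values are made up here, the correlation is assumed to be Pearson's, and the abstract does not say how the overall figures are aggregated across sensors or utterances.

```python
import math


def rmse(measured, estimated):
    """Root-mean-square error between two equal-length trajectories.

    The result is in the same unit as the inputs (e.g. mm).
    """
    if len(measured) != len(estimated) or not measured:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    return math.sqrt(squared / len(measured))


def pearson(measured, estimated):
    """Pearson correlation coefficient between two equal-length trajectories."""
    n = len(measured)
    if n != len(estimated) or n < 2:
        raise ValueError("need two trajectories of equal length >= 2")
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    if var_m == 0 or var_e == 0:
        raise ValueError("correlation is undefined for a constant trajectory")
    return cov / math.sqrt(var_m * var_e)


# Hypothetical sensor coordinate over four frames (NOT data from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.5, 2.5, 4.5]

print(f"RMSE: {rmse(measured, estimated):.2f} mm")
print(f"Correlation: {pearson(measured, estimated):.2f}")
```

The two metrics are complementary: RMSE captures the absolute distance between the estimated and measured positions, while correlation captures how well the estimate follows the shape of the movement regardless of a constant offset or scale.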


Similar articles

1
An episodic memory-based solution for the acoustic-to-articulatory inversion problem.
J Acoust Soc Am. 2013 May;133(5):2921-30. doi: 10.1121/1.4798665.
2
Improved speech inversion using general regression neural network.
J Acoust Soc Am. 2015 Sep;138(3):EL229-35. doi: 10.1121/1.4929626.
3
Automatic measurement of voice onset time using discriminative structured prediction.
J Acoust Soc Am. 2012 Dec;132(6):3965-79. doi: 10.1121/1.4763995.
4
Incorporation of phonetic constraints in acoustic-to-articulatory inversion.
J Acoust Soc Am. 2008 Apr;123(4):2310-23. doi: 10.1121/1.2885747.
5
The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio.
J Acoust Soc Am. 2013 Nov;134(5):EL452-8. doi: 10.1121/1.4824632.
6
Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion.
J Acoust Soc Am. 2005 Jul;118(1):444-60. doi: 10.1121/1.1921448.
7
On smoothing articulatory trajectories obtained from Gaussian mixture model based acoustic-to-articulatory inversion.
J Acoust Soc Am. 2013 Aug;134(2):EL258-64. doi: 10.1121/1.4813590.
8
The effect of articulatory adjustment on reducing hypernasality.
J Speech Lang Hear Res. 2012 Oct;55(5):1438-48. doi: 10.1044/1092-4388(2012/11-0142). Epub 2012 Mar 12.
9
Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion.
J Acoust Soc Am. 2011 Oct;130(4):EL251-7. doi: 10.1121/1.3634122.
10
Automatic intelligibility assessment of speakers after laryngeal cancer by means of acoustic modeling.
J Voice. 2012 May;26(3):390-7. doi: 10.1016/j.jvoice.2011.04.010. Epub 2011 Aug 5.