A procedure for estimating gestural scores from speech acoustics.

Affiliation

Haskins Laboratories, 300 George Street, Suite 900, New Haven, Connecticut 06511, USA.

Publication information

J Acoust Soc Am. 2012 Dec;132(6):3980-9. doi: 10.1121/1.4763545.

DOI: 10.1121/1.4763545
PMID: 23231127
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3528686/
Abstract

Speech can be represented as a constellation of constricting vocal tract actions called gestures, whose temporal patterning with respect to one another is expressed in a gestural score. Current speech datasets do not come with gestural annotation and no formal gestural annotation procedure exists at present. This paper describes an iterative analysis-by-synthesis landmark-based time-warping architecture to perform gestural annotation of natural speech. For a given utterance, the Haskins Laboratories Task Dynamics and Application (TADA) model is employed to generate a corresponding prototype gestural score. The gestural score is temporally optimized through an iterative time-warping process such that the acoustic distance between the original and TADA-synthesized speech is minimized. This paper demonstrates that the proposed iterative approach is superior to conventional acoustically-referenced dynamic time-warping procedures and provides reliable gestural annotation for speech datasets.
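
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of textbook dynamic time warping between two sequences of acoustic feature frames. It is not the paper's iterative procedure and does not involve TADA. The function name `dtw`, the Euclidean frame distance, and the list-of-lists frame representation are assumptions chosen for illustration.

```python
from math import inf


def dtw(a, b):
    """Align two sequences of feature frames with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching frame a[i] to frame b[j]. Frame distance is assumed
    to be Euclidean; real systems often use other acoustic distances.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both
                                 cost[i - 1][j],      # advance a only
                                 cost[i][j - 1])      # advance b only

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        best = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
        if best == cost[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif best == cost[i - 1][j]:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n][m], path


# Toy one-dimensional "frames": b repeats the middle frame of a.
natural = [[0.0], [1.0], [2.0]]
synthetic = [[0.0], [1.0], [1.0], [2.0]]
total, alignment = dtw(natural, synthetic)
print(total, alignment)
```

The recovered path says which synthesized frames correspond to which natural frames, and that correspondence is the information a time-warping step can use to retime an annotation. In the approach the abstract describes, an acoustic distance of this general kind is the quantity minimized across iterations.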
