A procedure for estimating gestural scores from speech acoustics.

Affiliation

Haskins Laboratories, 300 George Street, Suite 900, New Haven, Connecticut 06511, USA.

Publication information

J Acoust Soc Am. 2012 Dec;132(6):3980-9. doi: 10.1121/1.4763545.

DOI:10.1121/1.4763545

PMID:23231127

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3528686/

Abstract

Speech can be represented as a constellation of constricting vocal tract actions called gestures, whose temporal patterning with respect to one another is expressed in a gestural score. Current speech datasets do not come with gestural annotation and no formal gestural annotation procedure exists at present. This paper describes an iterative analysis-by-synthesis landmark-based time-warping architecture to perform gestural annotation of natural speech. For a given utterance, the Haskins Laboratories Task Dynamics and Application (TADA) model is employed to generate a corresponding prototype gestural score. The gestural score is temporally optimized through an iterative timing-warping process such that the acoustic distance between the original and TADA-synthesized speech is minimized. This paper demonstrates that the proposed iterative approach is superior to conventional acoustically-referenced dynamic timing-warping procedures and provides reliable gestural annotation for speech datasets.
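For orientation, the conventional baseline named in the abstract, acoustically referenced dynamic time warping, can be sketched as follows. This is a minimal textbook illustration, not the paper's iterative landmark-based procedure and not part of TADA. The function name `dtw_align`, the Euclidean frame distance, and the toy feature frames are all assumptions made for the example; in practice the frames would be acoustic feature vectors computed from the natural and synthesized utterances.

```python
import math


def dtw_align(ref, syn):
    """Align two sequences of feature frames with classic dynamic time warping.

    ref, syn: lists of equal-length numeric tuples (hypothetical acoustic frames).
    Returns (total_distance, path), where path is a list of (ref_idx, syn_idx).
    """
    n, m = len(ref), len(syn)
    inf = float("inf")

    # acc[i][j] = minimal accumulated cost of aligning ref[:i] with syn[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(ref[i - 1], syn[j - 1])  # Euclidean frame distance
            acc[i][j] = local + min(
                acc[i - 1][j - 1],  # match both frames
                acc[i - 1][j],      # advance ref only
                acc[i][j - 1],      # advance syn only
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return acc[n][m], path


# Toy example: the second sequence repeats its first frame once.
natural = [(0.0,), (1.0,), (2.0,), (3.0,)]
synthetic = [(0.0,), (0.0,), (1.0,), (2.0,), (3.0,)]
distance, path = dtw_align(natural, synthetic)
```

The recovered path maps each frame of one signal onto frames of the other, which is the kind of timing information a warping-based annotation scheme would use to shift the time boundaries in a prototype gestural score. According to the abstract, the proposed method instead applies warping iteratively within an analysis-by-synthesis loop and is reported to outperform this single-pass baseline.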
