Analysis of head gesture and prosody patterns for prosody-driven head-gesture animation.

Authors

Sargin Mehmet E, Yemez Yucel, Erzin Engin, Tekalp Ahmet M

Affiliation

Department of Electrical and Computer Engineering, University of California-Santa Barbara, Santa Barbara, CA 93106-9560, USA.

Publication

IEEE Trans Pattern Anal Mach Intell. 2008 Aug;30(8):1330-45. doi: 10.1109/TPAMI.2007.70797.

DOI: 10.1109/TPAMI.2007.70797
PMID: 18566489
Abstract

We propose a new two-stage framework for joint analysis of head gesture and speech prosody patterns of a speaker towards automatic realistic synthesis of head gestures from speech prosody. In the first stage analysis, we perform Hidden Markov Model (HMM) based unsupervised temporal segmentation of head gesture and speech prosody features separately to determine elementary head gesture and speech prosody patterns, respectively, for a particular speaker. In the second stage, joint analysis of correlations between these elementary head gesture and prosody patterns is performed using Multi-Stream HMMs to determine an audio-visual mapping model. The resulting audio-visual mapping model is then employed to synthesize natural head gestures from arbitrary input test speech given a head model for the speaker. In the synthesis stage, the audio-visual mapping model is used to predict a sequence of gesture patterns from the prosody pattern sequence computed for the input test speech. The Euler angles associated with each gesture pattern are then applied to animate the speaker head model. Objective and subjective evaluations indicate that the proposed synthesis-by-analysis scheme provides natural looking head gestures for the speaker with any input test speech, as well as in "prosody transplant" and "gesture transplant" scenarios.
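The synthesis stage described above predicts the most likely sequence of elementary gesture patterns from a sequence of elementary prosody patterns via an HMM. As a rough illustration of that decoding step (not the authors' code; the paper uses Multi-Stream HMMs over learned patterns), the sketch below runs Viterbi decoding on a toy discrete HMM whose hidden states stand in for gesture patterns and whose observations stand in for prosody-pattern labels. All parameter values are made up for illustration.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete-observation HMM.

    obs : sequence of observation indices (prosody-pattern labels)
    pi  : (n_states,) initial state probabilities
    A   : (n_states, n_states) transition matrix, A[i, j] = P(j | i)
    B   : (n_states, n_obs) emission matrix, B[i, k] = P(obs k | state i)
    """
    n, T = len(pi), len(obs)
    logd = np.log(pi) + np.log(B[:, obs[0]])   # log Viterbi scores at t = 0
    back = np.zeros((T, n), dtype=int)         # backpointers
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)     # scores[i, j]: best path into j via i
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):              # follow backpointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy model: 2 hypothetical gesture patterns, 3 prosody-pattern labels.
pi = np.array([0.6, 0.4])
A  = np.array([[0.7, 0.3],
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],
               [0.1, 0.3, 0.6]])

gestures = viterbi([0, 0, 2, 2], pi, A, B)
print(gestures)  # [0, 0, 1, 1]
```

In the paper's pipeline, each decoded gesture pattern would then index a set of Euler-angle trajectories used to animate the speaker's head model; the toy example only shows the pattern-sequence prediction.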


Similar Articles

1. Analysis of head gesture and prosody patterns for prosody-driven head-gesture animation.
   IEEE Trans Pattern Anal Mach Intell. 2008 Aug;30(8):1330-45. doi: 10.1109/TPAMI.2007.70797.
2. A unified framework for gesture recognition and spatiotemporal gesture segmentation.
   IEEE Trans Pattern Anal Mach Intell. 2009 Sep;31(9):1685-99. doi: 10.1109/TPAMI.2008.203.
3. Accurate visible speech synthesis based on concatenating variable length motion capture data.
   IEEE Trans Vis Comput Graph. 2006 Mar-Apr;12(2):266-76. doi: 10.1109/TVCG.2006.18.
4. Head yaw estimation from asymmetry of facial appearance.
   IEEE Trans Syst Man Cybern B Cybern. 2008 Dec;38(6):1501-12. doi: 10.1109/TSMCB.2008.928231.
5. Tracking the visual focus of attention for a varying number of wandering people.
   IEEE Trans Pattern Anal Mach Intell. 2008 Jul;30(7):1212-29. doi: 10.1109/TPAMI.2007.70773.
6. Expressive facial animation synthesis by learning speech coarticulation and expression spaces.
   IEEE Trans Vis Comput Graph. 2006 Nov-Dec;12(6):1523-34. doi: 10.1109/TVCG.2006.90.
7. Spectral matting.
   IEEE Trans Pattern Anal Mach Intell. 2008 Oct;30(10):1699-712. doi: 10.1109/TPAMI.2008.168.
8. Deformation modeling for robust 3D face matching.
   IEEE Trans Pattern Anal Mach Intell. 2008 Aug;30(8):1346-57. doi: 10.1109/TPAMI.2007.70784.
9. Unsupervised fully automated inline analysis of global left ventricular function in CINE MR imaging.
   Invest Radiol. 2009 Aug;44(8):463-8. doi: 10.1097/RLI.0b013e3181aaf429.
10. Online gesture spotting from visual hull data.
   IEEE Trans Pattern Anal Mach Intell. 2011 Jun;33(6):1175-88. doi: 10.1109/TPAMI.2010.199.

Cited By

1. Zero-shot style transfer for gesture animation driven by text and speech using adversarial disentanglement of multimodal style encoding.
   Front Artif Intell. 2023 Jun 12;6:1142997. doi: 10.3389/frai.2023.1142997. eCollection 2023.
2. Automating the Production of Communicative Gestures in Embodied Characters.
   Front Psychol. 2018 Jul 9;9:1144. doi: 10.3389/fpsyg.2018.01144. eCollection 2018.
3. Head Motion Modeling for Human Behavior Analysis in Dyadic Interaction.
   IEEE Trans Multimedia. 2015 Jul 13;17(7):1107-1119. doi: 10.1109/TMM.2015.2432671. Epub 2015 May 13.
4. Hidden Markov model analysis of maternal behavior patterns in inbred and reciprocal hybrid mice.
   PLoS One. 2011 Mar 8;6(3):e14753. doi: 10.1371/journal.pone.0014753.