Pose-Aware 3D Talking Face Synthesis Using Geometry-Guided Audio-Vertices Attention.

Authors

Li Bo, Wei Xiaolin, Liu Bin, He Zhifen, Cao Junjie, Lai Yu-Kun

Publication information

IEEE Trans Vis Comput Graph. 2025 Mar;31(3):1758-1771. doi: 10.1109/TVCG.2024.3371064. Epub 2025 Jan 30.

DOI: 10.1109/TVCG.2024.3371064
PMID: 38416616

Abstract

Most of the existing 3D talking face synthesis methods suffer from the lack of detailed facial expressions and realistic head poses, resulting in unsatisfactory experiences for users. In this article, we propose a novel pose-aware 3D talking face synthesis method with a novel geometry-guided audio-vertices attention. To capture more detailed expression, such as the subtle nuances of mouth shape and eye movement, we propose to build hierarchical audio features including a global attribute feature and a series of vertex-wise local latent movement features. Then, in order to fully exploit the topology of facial models, we further propose a novel geometry-guided audio-vertices attention module to predict the displacement of each vertex by using vertex connectivity relations to take full advantage of the corresponding hierarchical audio features. Finally, to accomplish pose-aware animation, we expand the existing database with an additional pose attribute, and a novel pose estimation module is proposed by paying attention to the whole head model. Numerical experiments demonstrate the effectiveness of the proposed method on realistic expression and head movements against state-of-the-art methods.
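
The abstract gives no implementation details, so the following is only a minimal sketch of one way attention could be restricted by mesh connectivity, not the authors' module. It assumes each vertex has a local audio feature vector, that a single global feature vector is added to form the query, and that a vertex attends only to itself and to vertices it shares an edge with. The function names, the additive query, and the linear projection `proj` are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_neighbors(num_vertices, edges):
    """Each vertex may attend to itself and to vertices sharing a mesh edge."""
    nbrs = [{i} for i in range(num_vertices)]
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    return [sorted(s) for s in nbrs]


def geometry_masked_attention(local_feats, global_feat, edges, proj):
    """Predict a 3D displacement per vertex (illustrative assumption only).

    local_feats: one d-dimensional feature vector per vertex
    global_feat: one d-dimensional vector shared by all vertices
    edges:       (i, j) vertex index pairs taken from the mesh topology
    proj:        3 x d matrix mapping attended features to a displacement
    """
    d = len(global_feat)
    neighbors = build_neighbors(len(local_feats), edges)
    displacements = []
    for i, nbr in enumerate(neighbors):
        # Hypothetical query: local feature plus the global attribute feature.
        query = [l + g for l, g in zip(local_feats[i], global_feat)]
        # Scaled dot-product scores, computed only over connected vertices.
        scores = [dot(query, local_feats[j]) / math.sqrt(d) for j in nbr]
        weights = softmax(scores)
        # Weighted mix of the neighbors' local features.
        mixed = [
            sum(w * local_feats[j][k] for w, j in zip(weights, nbr))
            for k in range(d)
        ]
        displacements.append([dot(row, mixed) for row in proj])
    return displacements
```

Looping over neighbor lists rather than masking a dense score matrix keeps the cost proportional to the number of mesh edges, which is the usual reason to exploit connectivity in this way.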

Similar articles

1. Pose-Aware 3D Talking Face Synthesis Using Geometry-Guided Audio-Vertices Attention.
IEEE Trans Vis Comput Graph. 2025 Mar;31(3):1758-1771. doi: 10.1109/TVCG.2024.3371064. Epub 2025 Jan 30.
2. 3D Talking Face With Personalized Pose Dynamics.
IEEE Trans Vis Comput Graph. 2023 Feb;29(2):1438-1449. doi: 10.1109/TVCG.2021.3117484. Epub 2022 Dec 29.
3. Geometry-Guided Dense Perspective Network for Speech-Driven Facial Animation.
IEEE Trans Vis Comput Graph. 2022 Dec;28(12):4873-4886. doi: 10.1109/TVCG.2021.3107669. Epub 2022 Oct 26.
4. StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads.
IEEE Trans Pattern Anal Mach Intell. 2024 Jun;46(6):4331-4347. doi: 10.1109/TPAMI.2024.3357808. Epub 2024 May 7.
5. Talking Face Generation With Audio-Deduced Emotional Landmarks.
IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):14099-14111. doi: 10.1109/TNNLS.2023.3274676. Epub 2024 Oct 7.
6. Real Time 3D Facial Movement Tracking Using a Monocular Camera.
Sensors (Basel). 2016 Jul 25;16(8):1157. doi: 10.3390/s16081157.
7. Learn2Talk: 3D Talking Face Learns from 2D Talking Face.
IEEE Trans Vis Comput Graph. 2024 Oct 7;PP. doi: 10.1109/TVCG.2024.3476275.
8. Face-from-Depth for Head Pose Estimation on Depth Images.
IEEE Trans Pattern Anal Mach Intell. 2020 Mar;42(3):596-609. doi: 10.1109/TPAMI.2018.2885472. Epub 2018 Dec 7.
9. Reconstructing 3D Face Model with Associated Expression Deformation from a Single Face Image via Constructing a Low-Dimensional Expression Deformation Manifold.
IEEE Trans Pattern Anal Mach Intell. 2011 Oct;33(10):2115-21. doi: 10.1109/TPAMI.2011.88. Epub 2011 May 12.
10. Establishing point correspondence of 3D faces via sparse facial deformable model.
IEEE Trans Image Process. 2013 Nov;22(11):4170-81. doi: 10.1109/TIP.2013.2271115. Epub 2013 Jun 26.

Cited by

1. Continuous Talking Face Generation Based on Gaussian Blur and Dynamic Convolution.
Sensors (Basel). 2025 Mar 18;25(6):1885. doi: 10.3390/s25061885.