Multi-Task Head Pose Estimation in-the-Wild.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2874-2881. doi: 10.1109/TPAMI.2020.3046323. Epub 2021 Jul 1.

DOI:10.1109/TPAMI.2020.3046323
PMID:33351746
Abstract

We present a deep learning-based multi-task approach for head pose estimation in images. We contribute a network architecture and training strategy that harness the strong dependencies among face pose, alignment and visibility to produce a top-performing model for all three tasks. Our architecture is an encoder-decoder CNN with residual blocks and lateral skip connections. We show that combining head pose estimation with landmark-based face alignment significantly improves the performance of the former task. Further, placing the pose task at the bottleneck layer, at the end of the encoder, and the tasks that depend on spatial information, such as visibility and alignment, in the final decoder layer also contributes to the final performance. In the experiments conducted, the proposed model outperforms the state of the art in the face pose and visibility tasks. By including a final landmark regression step, it also produces face alignment results on par with the state of the art.
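
The placement of the task heads described in the abstract can be illustrated at the level of tensor shapes. The following is a minimal sketch, not the authors' implementation: it only traces feature-map shapes through an encoder-decoder with lateral skip connections and marks where each task would attach. The names `FeatureMap`, `encoder_stage`, `decoder_stage` and `trace_network` are hypothetical, and the input size (128), depth (4), channel doubling, 68 landmarks, heatmap-style alignment output and three pose angles are all assumptions that the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMap:
    """Shape-only stand-in for a CNN activation (channels, height, width)."""
    channels: int
    height: int
    width: int


def encoder_stage(x: FeatureMap) -> FeatureMap:
    """Assumed encoder step: residual block, then 2x downsampling and channel doubling."""
    return FeatureMap(x.channels * 2, x.height // 2, x.width // 2)


def decoder_stage(x: FeatureMap, lateral: FeatureMap) -> FeatureMap:
    """Assumed decoder step: 2x upsampling, then merge with the encoder's lateral feature map."""
    up = FeatureMap(x.channels // 2, x.height * 2, x.width * 2)
    # An element-wise lateral merge (assumed here) requires identical shapes.
    if up != lateral:
        raise ValueError(f"lateral shape mismatch: {up} vs {lateral}")
    return up


def trace_network(input_size=128, base_channels=32, depth=4, num_landmarks=68):
    """Trace shapes and report where each of the three tasks attaches."""
    x = FeatureMap(base_channels, input_size, input_size)
    laterals = []
    for _ in range(depth):
        laterals.append(x)  # kept for the matching decoder stage
        x = encoder_stage(x)

    # Pose head sits at the bottleneck (end of the encoder).
    bottleneck = x
    pose_shape = (3,)  # assumed: yaw, pitch, roll

    for lateral in reversed(laterals):
        x = decoder_stage(x, lateral)

    # Spatially dependent heads sit on the final decoder layer.
    alignment_shape = (num_landmarks, x.height, x.width)  # assumed: one heatmap per landmark
    visibility_shape = (num_landmarks,)  # assumed: one visibility score per landmark

    return {
        "bottleneck": bottleneck,
        "final_decoder": x,
        "pose": pose_shape,
        "alignment": alignment_shape,
        "visibility": visibility_shape,
    }


if __name__ == "__main__":
    for name, shape in trace_network().items():
        print(name, shape)
```

In this sketch the pose output is read from the low-resolution bottleneck, while alignment and visibility are read from the full-resolution decoder output, mirroring the split the abstract describes. The input size must be divisible by `2 ** depth`, otherwise the lateral shapes no longer match and `decoder_stage` raises an error.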

Similar articles

1
Multi-Task Head Pose Estimation in-the-Wild.
IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2874-2881. doi: 10.1109/TPAMI.2020.3046323. Epub 2021 Jul 1.
2
Towards bi-directional skip connections in encoder-decoder architectures and beyond.
Med Image Anal. 2022 May;78:102420. doi: 10.1016/j.media.2022.102420. Epub 2022 Mar 16.
3
A multiple-channel and atrous convolution network for ultrasound image segmentation.
Med Phys. 2020 Dec;47(12):6270-6285. doi: 10.1002/mp.14512. Epub 2020 Oct 18.
4
FDRN: A fast deformable registration network for medical images.
Med Phys. 2021 Oct;48(10):6453-6463. doi: 10.1002/mp.15011. Epub 2021 Jul 6.
5
Detection, segmentation, and 3D pose estimation of surgical tools using convolutional neural networks and algebraic geometry.
Med Image Anal. 2021 May;70:101994. doi: 10.1016/j.media.2021.101994. Epub 2021 Feb 7.
6
A novel M-SegNet with global attention CNN architecture for automatic segmentation of brain MRI.
Comput Biol Med. 2021 Sep;136:104761. doi: 10.1016/j.compbiomed.2021.104761. Epub 2021 Aug 13.
7
HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition.
IEEE Trans Pattern Anal Mach Intell. 2019 Jan;41(1):121-135. doi: 10.1109/TPAMI.2017.2781233. Epub 2017 Dec 8.
8
Multi-Path U-Net Architecture for Cell and Colony-Forming Unit Image Segmentation.
Sensors (Basel). 2022 Jan 27;22(3):990. doi: 10.3390/s22030990.
9
Multi-Task Convolutional Neural Network for Pose-Invariant Face Recognition.
IEEE Trans Image Process. 2018 Feb;27(2):964-975. doi: 10.1109/TIP.2017.2765830.
10
TypeSeg: A type-aware encoder-decoder network for multi-type ultrasound images co-segmentation.
Comput Methods Programs Biomed. 2022 Feb;214:106580. doi: 10.1016/j.cmpb.2021.106580. Epub 2021 Dec 17.

Articles citing this work

1
Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network.
Sensors (Basel). 2025 Sep 5;25(17):5548. doi: 10.3390/s25175548.
2
Automated Cattle Head and Ear Pose Estimation Using Deep Learning for Animal Welfare Research.
Vet Sci. 2025 Jul 13;12(7):664. doi: 10.3390/vetsci12070664.
3
Heatmap-Guided Selective Feature Attention for Robust Cascaded Face Alignment.
Sensors (Basel). 2023 May 13;23(10):4731. doi: 10.3390/s23104731.
4
An Integrated Framework for Multi-State Driver Monitoring Using Heterogeneous Loss and Attention-Based Feature Decoupling.
Sensors (Basel). 2022 Sep 29;22(19):7415. doi: 10.3390/s22197415.
5
An Improved Tiered Head Pose Estimation Network with Self-Adjust Loss Function.
Entropy (Basel). 2022 Jul 14;24(7):974. doi: 10.3390/e24070974.