A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2018 Dec;40(12):3059-3066. doi: 10.1109/TPAMI.2017.2772922. Epub 2017 Nov 13.

DOI:10.1109/TPAMI.2017.2772922

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC6262843/

Abstract

Three-dimensional shape reconstruction of 2D landmark points on a single image is a hallmark of human vision, but is a task that has proven difficult for computer vision algorithms. We define a feed-forward deep neural network algorithm that can reconstruct 3D shapes from 2D landmark points almost perfectly (i.e., with extremely small reconstruction errors), even when these 2D landmarks are from a single image. Our experimental results show an improvement of up to two-fold over state-of-the-art computer vision algorithms; 3D shape reconstruction error (measured as the Procrustes distance between the reconstructed shape and the ground-truth) of human faces is , cars is .0022, human bodies is .022, and highly-deformable flags is .0004. Our algorithm was also a top performer at the 2016 3D Face Alignment in the Wild Challenge competition (held in conjunction with the European Conference on Computer Vision, ECCV), which required the reconstruction of 3D face shape from a single image. The derived algorithm can be trained in a couple of hours, and testing runs at more than 1,000 frames/s on an i7 desktop. We also present an innovative data augmentation approach that allows us to train the system efficiently with a small number of samples. The system is also robust to noise (e.g., imprecise landmark points) and missing data (e.g., occluded or undetected landmark points).
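
The abstract reports error as the Procrustes distance between the reconstructed and ground-truth shapes. As a rough illustration of that metric, the sketch below computes one common variant with NumPy: both shapes are centered, scaled to unit Frobenius norm, and aligned by the best proper rotation before the residual is measured. The function name `procrustes_distance` and the specific normalization are assumptions made for this example; the abstract does not state which normalization the authors used, so values from this sketch are not guaranteed to be comparable with the reported numbers.

```python
import numpy as np


def procrustes_distance(X, Y):
    """Procrustes distance between two (n_points, 3) landmark arrays.

    Hypothetical helper for illustration only; the normalization
    (centering + unit Frobenius norm) is an assumption, not taken
    from the paper.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Remove translation by centering each shape at the origin.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)

    # Remove scale by normalizing each shape to unit Frobenius norm.
    X0 = X0 / np.linalg.norm(X0)
    Y0 = Y0 / np.linalg.norm(Y0)

    # Find the rotation R minimizing ||X0 - Y0 @ R||_F via SVD
    # (the orthogonal Procrustes / Kabsch solution).
    U, _, Vt = np.linalg.svd(Y0.T @ X0)

    # Flip the last axis if needed so that R is a proper rotation
    # (determinant +1) rather than a reflection.
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt

    # Residual after optimal alignment.
    return np.linalg.norm(X0 - Y0 @ R)
```

With this definition, a shape compared against a rotated, scaled, and translated copy of itself gives a distance close to zero, while a mirrored copy of a generic non-planar shape does not, because reflections are excluded from the alignment.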
