Huang Buzhen, Zhang Tianshu, Wang Yangang
IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):5010-5026. doi: 10.1109/TPAMI.2022.3199449. Epub 2023 Mar 7.
Occlusions between humans and objects, especially during human-object interactions, are very common in practical applications. However, most existing approaches for 3D human shape and pose estimation require that human bodies be well captured, either without occlusions or with only minor self-occlusions. In this paper, we focus on directly estimating object-occluded human shape and pose from single color images. Our key idea is to represent an object-occluded human body with a partial UV map, so that full 3D human shape estimation is ultimately converted into an image inpainting problem. We propose a novel two-branch network architecture to train an end-to-end regressor via latent distribution consistency; it also includes a novel visible-feature sub-net that extracts human information from object-occluded color images. To supervise network training, we further build a new dataset named 3DOH50K. Several experiments are conducted to demonstrate the effectiveness of the proposed method, and the results show that it achieves state-of-the-art performance compared with previous methods. The dataset and code are publicly available at https://www.yangangwang.com/papers/ZHANG-OOH-2020-03.html.
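To make the core idea concrete, the pipeline above can be sketched as: encode the body surface as a dense UV position map, mask out the object-occluded texels to obtain a partial UV map, then "inpaint" the missing region to recover the full shape. The sketch below is a minimal, hypothetical illustration of that data flow only; the toy coordinates, mask, and mean-fill "inpainter" are stand-ins for the paper's trained two-branch network and are not its actual method.

```python
import numpy as np

def make_uv_map(h=8, w=8):
    """Dense UV position map: each texel stores a 3D surface point.
    Toy coordinates stand in for real SMPL-style surface positions."""
    u, v = np.meshgrid(np.linspace(0, 1, w), np.linspace(0, 1, h))
    return np.stack([u, v, u * v], axis=-1)  # shape (h, w, 3)

def occlude(uv_map, mask):
    """Zero out texels hidden by an object, yielding a partial UV map."""
    return uv_map * mask[..., None]

def inpaint(partial, mask):
    """Placeholder inpainter: fill occluded texels with the mean of the
    visible ones. The paper instead trains an end-to-end regressor with
    latent distribution consistency; this only mimics the input/output
    contract (partial UV map in, completed UV map out)."""
    visible = mask.astype(bool)
    filled = partial.copy()
    filled[~visible] = partial[visible].mean(axis=0)
    return filled

uv = make_uv_map()
mask = np.ones(uv.shape[:2])
mask[2:5, 2:5] = 0            # toy object occlusion in UV space
partial = occlude(uv, mask)   # what a visible-feature branch would see
full = inpaint(partial, mask) # completed shape representation
```

Visible texels pass through unchanged, while the occluded block is reconstructed, which is the contract the inpainting formulation relies on.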