
DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors.

Affiliations

Centre for Research and Technology Hellas, Information Technologies Institute, 6th km Charilaou-Thermi, 57001 Thermi, Thessaloniki, Greece.

National Technical University of Athens, School of Electrical and Computer Engineering, Zografou Campus, Iroon Polytechniou 9, 15780 Zografou, Athens, Greece.

Publication information

Sensors (Basel). 2019 Jan 11;19(2):282. doi: 10.3390/s19020282.

Abstract

In this paper, a marker-based, single-person optical motion capture method (DeepMoCap) is proposed using multiple spatio-temporally aligned infrared-depth sensors and retro-reflective straps and patches (reflectors). DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, in 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depthmaps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency across sequential frames. The extracted 2D reflector locations are spatially mapped into 3D space, resulting in robust 3D optical data extraction. The subject's motion is efficiently captured by applying a template-based fitting technique to the extracted optical data. Two datasets have been created and made publicly available for evaluation purposes: one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground-truth MoCap data (DMC3D). The FCN model outperforms its competitors on the DMC2.5D dataset using the 2D Percentage of Correct Keypoints (PCK) metric, while the motion capture outcome is evaluated against RGB-D and inertial data fusion approaches on DMC3D, outperforming the next best method by 4.5% in total 3D PCK accuracy.
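The step where "2D reflector locations are spatially mapped into 3D space" corresponds, at its core, to back-projecting detected pixels through the depth sensor's intrinsics. The sketch below shows standard pinhole back-projection for a single view; the function name, intrinsic values, and single-sensor setup are illustrative assumptions, since the paper actually fuses multiple spatio-temporally aligned depth sensors, which this sketch does not cover.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with a measured depth value to a 3D
    camera-space point using the standard pinhole model:
        X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth.

    u, v   : pixel coordinates of a detected reflector
    depth  : depth at (u, v), in meters
    fx, fy, cx, cy : camera intrinsics (focal lengths, principal point)
    """
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Hypothetical intrinsics roughly in the range of an infrared-depth sensor:
point = backproject(u=320.0, v=240.0, depth=2.5,
                    fx=365.0, fy=365.0, cx=256.0, cy=212.0)
print(point)  # camera-space 3D coordinates of the reflector
```

Points recovered this way live in each sensor's own coordinate frame; combining views additionally requires the extrinsic (spatial) alignment the abstract refers to.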

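Both evaluations rely on the Percentage of Correct Keypoints (PCK) metric. For reference, here is a minimal sketch of how a PCK-style score is typically computed; the array layout, threshold convention, and synthetic data are assumptions for illustration, not the paper's exact protocol.

```python
import numpy as np

def pck(pred, gt, threshold):
    """Percentage of Correct Keypoints (PCK).

    A predicted keypoint counts as correct when its distance to the
    ground-truth keypoint is within `threshold`. The same formula covers
    the 2D (pixel) and 3D (metric) variants mentioned in the abstract.

    pred, gt  : arrays of shape (num_frames, num_keypoints, dim)
    threshold : scalar tolerance (pixels for 2D, e.g. meters for 3D)
    """
    dists = np.linalg.norm(pred - gt, axis=-1)  # per-keypoint error
    return float(np.mean(dists <= threshold))   # fraction within tolerance

# Illustrative usage with random data (not the DMC2.5D/DMC3D datasets):
rng = np.random.default_rng(0)
gt = rng.normal(size=(100, 16, 3))
pred = gt + rng.normal(scale=0.02, size=gt.shape)  # small synthetic noise
print(f"3D PCK @ 0.05: {pck(pred, gt, 0.05):.3f}")
```

Under this reading, the reported 4.5% margin means DeepMoCap places 4.5 percentage points more keypoints within the tolerance than the next best fusion approach.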

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/438b/6359336/40c5263dd10a/sensors-19-00282-g001.jpg
