IEEE Trans Pattern Anal Mach Intell. 2018 Mar;40(3):667-681. doi: 10.1109/TPAMI.2017.2691768. Epub 2017 Apr 6.
Recognizing human actions from unknown and unseen (novel) views is a challenging problem. We propose a Robust Non-Linear Knowledge Transfer Model (R-NKTM) for human action recognition from novel views. The proposed R-NKTM is a deep fully-connected neural network that transfers knowledge of human actions from any unknown view to a shared high-level virtual view by finding a set of non-linear transformations that connect the views. The R-NKTM is learned from 2D projections of dense trajectories of synthetic 3D human models fitted to real motion capture data, and it generalizes to real videos of human actions. The strength of our technique is that we learn a single R-NKTM for all actions and all viewpoints, so knowledge can be transferred from any real human action video without re-training or fine-tuning the model. Thus, R-NKTM scales efficiently to incorporate new action classes. R-NKTM is learned with dummy labels and does not require knowledge of the camera viewpoint at any stage. Experiments on three benchmark cross-view human action datasets show that our method outperforms existing state-of-the-art methods.
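To make the knowledge-transfer idea concrete, below is a minimal sketch of an R-NKTM-style network in PyTorch. The descriptor dimensionality, layer widths, activation choice, optimizer, and the names `RNKTMSketch`, `DESC_DIM`, and `NUM_DUMMY` are all illustrative assumptions, not the paper's exact configuration. It captures the core mechanism described in the abstract: a stack of fully-connected non-linear layers maps a view-specific dense-trajectory descriptor to a shared virtual-view representation, trained only with dummy labels shared across all rendered viewpoints of the same synthetic sequence, so the camera viewpoint is never needed.

```python
# Minimal sketch of an R-NKTM-style knowledge-transfer network (illustrative only).
# Assumptions not taken from the paper: layer widths, activations, descriptor
# dimensionality, optimizer, and learning rate are placeholders.
import torch
import torch.nn as nn

DESC_DIM = 2000   # assumed size of the dense-trajectory descriptor (e.g., bag-of-words)
NUM_DUMMY = 500   # assumed number of dummy labels (one per synthetic mocap sequence)

class RNKTMSketch(nn.Module):
    """Fully-connected network mapping a view-specific action descriptor
    to a shared, view-invariant ("virtual view") representation."""
    def __init__(self):
        super().__init__()
        # Stacked non-linear transformations connecting views.
        self.encoder = nn.Sequential(
            nn.Linear(DESC_DIM, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),  # 256-d shared virtual-view features
        )
        # Dummy-label head drives training; it can be discarded afterwards.
        self.dummy_head = nn.Linear(256, NUM_DUMMY)

    def forward(self, x):
        z = self.encoder(x)            # view-invariant representation
        return z, self.dummy_head(z)   # features and dummy-label logits

# One training step: descriptors rendered from different viewpoints of the same
# synthetic 3D sequence share one dummy label, so the loss pulls all views of a
# sequence toward the same point in the virtual view -- no camera pose required.
model = RNKTMSketch()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, DESC_DIM)            # batch of trajectory descriptors
y = torch.randint(0, NUM_DUMMY, (32,))   # dummy sequence labels
_, logits = model(x)
loss = loss_fn(logits, y)
opt.zero_grad()
loss.backward()
opt.step()
```

In such a setup, the dummy-label head would be dropped at test time and the virtual-view features used as the cross-view action representation; because the encoder is fixed after training, new action classes can be added without re-training, as the abstract states.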