IEEE Trans Pattern Anal Mach Intell. 2017 Jan;39(1):102-114. doi: 10.1109/TPAMI.2016.2537337. Epub 2016 Mar 2.
This paper proposes a hierarchical clustering multi-task learning (HC-MTL) method for joint human action grouping and recognition. Specifically, we formulate the objective function as a group-wise least-squares loss regularized by low rank and sparsity with respect to two latent variables, the model parameters and the grouping information, for joint optimization. To handle this non-convex optimization, we decompose it into two sub-tasks: multi-task learning and task relatedness discovery. First, we convert the non-convex objective function into a convex formulation by fixing the latent grouping information. This new objective function focuses on multi-task learning by strengthening shared-action relationships and action-specific feature learning. Second, we leverage the learned model parameters for task relatedness measurement and clustering. In this way, HC-MTL can attain both optimal action models and group discovery by iterating between the two sub-tasks. The proposed method is validated on three kinds of challenging datasets: six realistic action datasets (Hollywood2, YouTube, UCF Sports, UCF50, HMDB51 & UCF101), two constrained datasets (KTH & TJU), and two multi-view datasets (MV-TJU & IXMAS). The extensive experimental results show that: 1) HC-MTL achieves performance competitive with the state of the art in action recognition and grouping; 2) HC-MTL overcomes the difficulty of heuristic action grouping based solely on human knowledge; and 3) HC-MTL avoids possible inconsistency between subjective action grouping derived from human knowledge and objective action grouping based on the feature subspace distributions of multiple actions. Comparison with the popular clustered multi-task learning further reveals that the latent relatedness discovered by HC-MTL aids in inducing group-wise multi-task learning and boosts performance.
To the best of our knowledge, this is the first work to break the assumption that all actions are either independent (for individual learning) or correlated (for joint modeling), and to propose HC-MTL for automated, joint action grouping and modeling.
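The alternating scheme described above can be illustrated with a minimal sketch: fix the grouping and solve each group's regularized multi-task least-squares problem, then cluster the learned model parameters to rediscover the grouping. This is a hypothetical illustration assuming a proximal-gradient solver and SciPy's agglomerative clustering; all function names and hyperparameters here are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def fit_group(X, Y, lam_rank=0.1, lam_sparse=0.01, lr=0.1, iters=300):
    """Illustrative solver for one group's multi-task least squares with
    low-rank (nuclear-norm) and sparsity (L1) regularization, via
    proximal gradient descent. X: (n, d) features; Y: (n, t) task labels."""
    n, d = X.shape
    t = Y.shape[1]
    W = np.zeros((d, t))
    for _ in range(iters):
        grad = X.T @ (X @ W - Y) / n                     # least-squares gradient
        W = W - lr * grad
        # proximal step for the L1 term (soft-thresholding)
        W = np.sign(W) * np.maximum(np.abs(W) - lr * lam_sparse, 0.0)
        # proximal step for the nuclear norm (singular-value thresholding)
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W = U @ np.diag(np.maximum(s - lr * lam_rank, 0.0)) @ Vt
    return W


def hc_mtl(X, Y, n_groups=2, outer_iters=5):
    """Alternate between (1) group-wise multi-task learning with the
    grouping fixed and (2) hierarchical clustering of the learned model
    parameters to update the grouping."""
    t = Y.shape[1]
    groups = np.zeros(t, dtype=int)                      # start with one group
    W = np.zeros((X.shape[1], t))
    for _ in range(outer_iters):
        # Step 1: fix the grouping, learn model parameters per group.
        for g in np.unique(groups):
            idx = np.where(groups == g)[0]
            W[:, idx] = fit_group(X, Y[:, idx])
        # Step 2: cluster tasks by similarity of their model parameters.
        Z = linkage(W.T, method="average", metric="euclidean")
        groups = fcluster(Z, t=n_groups, criterion="maxclust") - 1
    return W, groups
```

On synthetic data where two pairs of tasks share parameters, this sketch recovers both the per-task models and the two-group structure, mirroring the paper's claim that model parameters themselves carry the relatedness signal used for grouping.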