Liu Tianci, Shi Zelin, Liu Yunpeng
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China; and Key Lab of Image Understanding and Computer Vision, Liaoning Province, Shenyang 110016, China
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China; and Key Lab of Image Understanding and Computer Vision, Liaoning Province, Shenyang 110016, China
Neural Comput. 2019 Jan;31(1):156-175. doi: 10.1162/neco_a_01148. Epub 2018 Nov 21.
Modeling videos and image sets as linear subspaces has achieved great success in various visual recognition tasks. However, subspaces constructed from visual data are typically embedded in a high-dimensional ambient space, which limits the applicability of existing techniques. This letter explores a geometry-aware framework for constructing lower-dimensional subspaces with maximum discriminative power from high-dimensional subspaces in the supervised scenario. In particular, we use Riemannian geometry and optimization techniques on matrix manifolds to learn an orthogonal projection, and we show that the learning process can be formulated as an unconstrained optimization problem on a Grassmann manifold. With this natural geometry, any metric on the Grassmann manifold can in theory be used in our model. Experimental evaluations on several data sets show that our approach achieves significantly higher accuracy than other state-of-the-art algorithms.
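The abstract does not give the objective function or the update rule, so the following is only a minimal NumPy sketch of the underlying idea, not the authors' implementation. It assumes each image set is represented by an orthonormal basis `X` (size D x p), that the learned projection is a matrix `W` (size D x d) with orthonormal columns, and that the projection metric is the chosen Grassmann metric. The function names and the dimensions are hypothetical. The sketch maps a high-dimensional subspace to a lower-dimensional one and checks that the resulting distance depends only on the span of `W`, which is one way to see why the problem can be posed on a Grassmann manifold.

```python
import numpy as np


def orthonormal_basis(A):
    """Return an orthonormal basis for the column span of A (thin QR)."""
    Q, _ = np.linalg.qr(A)
    return Q


def project_subspace(W, X):
    """Map span(X) in R^D to span(W^T X) in R^d and re-orthonormalize.

    W^T X is generally not orthonormal, so a QR step is needed to obtain
    a valid basis for a point on the lower-dimensional Grassmann manifold.
    """
    return orthonormal_basis(W.T @ X)


def projection_metric_sq(Y1, Y2):
    """Squared projection-metric distance: p - ||Y1^T Y2||_F^2."""
    p = Y1.shape[1]
    return p - np.linalg.norm(Y1.T @ Y2, "fro") ** 2


# Hypothetical sizes: ambient dimension D, reduced dimension d, subspace order p.
D, d, p = 100, 10, 3
rng = np.random.default_rng(0)

X1 = orthonormal_basis(rng.standard_normal((D, p)))  # subspace of image set 1
X2 = orthonormal_basis(rng.standard_normal((D, p)))  # subspace of image set 2
W = orthonormal_basis(rng.standard_normal((D, d)))   # stand-in for a learned projection

Y1, Y2 = project_subspace(W, X1), project_subspace(W, X2)
dist = projection_metric_sq(Y1, Y2)

# Rotating the basis of W (W -> W R, R orthogonal) leaves the distance unchanged,
# so the distance is a function of span(W) only.
R = orthonormal_basis(rng.standard_normal((d, d)))
dist_rotated = projection_metric_sq(
    project_subspace(W @ R, X1), project_subspace(W @ R, X2)
)
print(np.isclose(dist, dist_rotated))
```

In a supervised setting, one would presumably choose `W` so that such distances are small within a class and large between classes, using a Riemannian optimizer over the Grassmann manifold; the specific discriminative criterion is not stated in the abstract and is not reproduced here.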