Liu Weiwei, Xu Donna, Tsang Ivor W, Zhang Wenjie
IEEE Trans Pattern Anal Mach Intell. 2019 Feb;41(2):408-422. doi: 10.1109/TPAMI.2018.2794976. Epub 2018 Jan 18.
Multi-output learning, the task of simultaneously predicting multiple outputs for a given input, has attracted increasing interest from researchers due to its wide applicability. The $k$ nearest neighbor ($k$NN) algorithm is one of the most popular frameworks for handling multi-output problems. The performance of $k$NN depends crucially on the metric used to compute the distance between instances. However, our experimental results show that existing advanced metric learning techniques cannot provide an appropriate distance metric for multi-output tasks. This paper systematically studies how to efficiently learn an appropriate distance metric for multi-output problems with provable guarantees. In particular, we present a novel large-margin metric learning paradigm for multi-output tasks, which projects both the input and the output into the same embedding space and then learns a distance metric that discovers output dependency, so that instances with very different outputs are pushed far apart. Several strategies are then proposed to speed up training and testing. Moreover, we study the generalization error bound of our method for three learning tasks, showing that our method converges to the optimal solutions. Experiments on three multi-output learning tasks (multi-label classification, multi-target regression, and multi-concept retrieval) validate the effectiveness and scalability of the proposed method.
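For concreteness, a generic triplet-based large-margin metric learning objective of the kind the abstract describes can be sketched as follows; the symbols $M$, $d_M$, and $\mathcal{T}$ are generic choices for illustration, not the authors' exact formulation:
$$\min_{M \succeq 0} \; \sum_{(i,j,k) \in \mathcal{T}} \big[\, 1 + d_M(x_i, x_j) - d_M(x_i, x_k) \,\big]_+ , \qquad d_M(x, x') = (x - x')^\top M \,(x - x'),$$
where $[\cdot]_+ = \max(\cdot, 0)$ is the hinge function, $M$ is a positive semidefinite matrix defining a Mahalanobis distance, and each triplet in $\mathcal{T}$ pairs an instance $x_i$ with a neighbor $x_j$ whose output vector resembles $y_i$ and an impostor $x_k$ whose output vector differs strongly. Minimizing this objective moves instances with very different multi-output targets at least a unit margin farther away than similar ones, which is the property the proposed $k$NN-based method relies on.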