Stock Michiel, Pahikkala Tapio, Airola Antti, De Baets Bernard, Waegeman Willem
KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, 9000 Ghent, Belgium
Department of Future Technologies, University of Turku, 20520 Turku, Finland
Neural Comput. 2018 Aug;30(8):2245-2283. doi: 10.1162/neco_a_01096. Epub 2018 Jun 12.
Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but the theoretical analysis of their behavior remains underexplored in the machine learning literature. In this work, we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form, efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter all arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights for assessing the advantages and limitations of existing pairwise learning methods.
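To make the closed-form instantiation mentioned in the abstract concrete, the following is a minimal NumPy sketch of Kronecker kernel ridge regression solved via the eigendecompositions of the two kernel matrices, without ever forming the Kronecker product explicitly. The function names and the random-data usage are illustrative assumptions, not code from the paper.

```python
import numpy as np

def kronecker_krr_fit(K, G, Y, lam=1.0):
    """Closed-form Kronecker kernel ridge regression.

    Solves (G kron K + lam * I) vec(A) = vec(Y) for the dual
    coefficient matrix A, using the eigendecompositions of the
    row-object kernel K and the column-object kernel G.
    """
    lam_k, U = np.linalg.eigh(K)   # K = U diag(lam_k) U^T
    lam_g, V = np.linalg.eigh(G)   # G = V diag(lam_g) V^T

    # Spectral filter 1 / (lam_k_i * lam_g_j + lam) on the
    # eigenvalue grid of the Kronecker product kernel.
    C = 1.0 / (np.outer(lam_k, lam_g) + lam)

    # A = U [ (U^T Y V) * C ] V^T  (elementwise product with C)
    return U @ ((U.T @ Y @ V) * C) @ V.T

def kronecker_krr_predict(K_new, G_new, A):
    """Predict labels for new (row, column) pairs.

    K_new: kernel between new and training row objects,
    G_new: kernel between new and training column objects.
    """
    return K_new @ A @ G_new.T

# Illustrative usage on random data (hypothetical example).
rng = np.random.default_rng(0)
X, Z = rng.normal(size=(30, 5)), rng.normal(size=(20, 4))
K = X @ X.T                      # linear kernel on row objects
G = Z @ Z.T                      # linear kernel on column objects
Y = rng.normal(size=(30, 20))    # pairwise label matrix
A = kronecker_krr_fit(K, G, Y, lam=0.1)
F = kronecker_krr_predict(K, G, A)   # in-sample predictions
```

Under this sketch, the special cases discussed in the paper amount to replacing the spectral filter `C`: for instance, a two-step variant would instead apply ridge regularization to each kernel separately, which corresponds to a filter of the form 1 / ((lam_k_i + lam_1)(lam_g_j + lam_2)).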