du Plessis Marthinus Christoffel, Shiino Hiroaki, Sugiyama Masashi
Department of Complexity Science and Engineering, University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan.
Department of Computer Science, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8552, Japan.
Neural Comput. 2015 Sep;27(9):1899-914. doi: 10.1162/NECO_a_00761. Epub 2015 Jul 10.
Many machine learning problems, such as nonstationarity adaptation, outlier detection, dimensionality reduction, and conditional density estimation, can be effectively solved by using the ratio of probability densities. Since the naive two-step procedure of first estimating the probability densities and then taking their ratio performs poorly, methods to directly estimate the density ratio from two sets of samples without density estimation have been extensively studied recently. However, these methods are batch algorithms that use the whole data set to estimate the density ratio, and they are inefficient in the online setup, where training samples are provided sequentially and solutions are updated incrementally without storing previous samples. In this letter, we propose two online density-ratio estimators based on the adaptive regularization of weight vectors. Through experiments on inlier-based outlier detection, we demonstrate the usefulness of the proposed methods.
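To make the online setup concrete, the following is a minimal sketch of an incremental density-ratio estimator. It is not the method proposed in this letter: it swaps in plain projected stochastic gradient descent on a least-squares density-ratio objective rather than the adaptive regularization of weight vectors. The Gaussian kernel model, the fixed centers, the class name `OnlineLeastSquaresRatio`, and all hyperparameter values are assumptions made for illustration only.

```python
import numpy as np


def gaussian_features(x, centers, width):
    """Gaussian kernel features of a scalar x against fixed centers."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))


class OnlineLeastSquaresRatio:
    """Hypothetical online estimator of r(x) = p_nu(x) / p_de(x).

    Model: r(x) ~ theta . phi(x) with theta >= 0. Each update takes one
    stochastic gradient step on the least-squares objective
        0.5 * E_de[(theta . phi(x))^2] - E_nu[theta . phi(x)]
        + 0.5 * reg * ||theta||^2
    using one numerator sample and one denominator sample, which can be
    discarded afterwards.
    """

    def __init__(self, centers, width=1.0, step=0.05, reg=0.01):
        self.centers = np.asarray(centers, dtype=float)
        self.width = width
        self.step = step
        self.reg = reg
        self.theta = np.zeros(len(self.centers))

    def update(self, x_nu, x_de):
        phi_nu = gaussian_features(x_nu, self.centers, self.width)
        phi_de = gaussian_features(x_de, self.centers, self.width)
        grad = (self.theta @ phi_de) * phi_de - phi_nu + self.reg * self.theta
        # Project onto the nonnegative orthant so the ratio stays >= 0.
        self.theta = np.maximum(0.0, self.theta - self.step * grad)

    def ratio(self, x):
        return float(self.theta @ gaussian_features(x, self.centers, self.width))


# Toy inlier-based outlier detection: inliers ~ N(0, 1) form the numerator,
# and a wider test distribution N(0, 2^2) forms the denominator.
rng = np.random.default_rng(0)
est = OnlineLeastSquaresRatio(centers=np.linspace(-4.0, 4.0, 9))
for x_nu, x_de in zip(rng.normal(0.0, 1.0, 5000), rng.normal(0.0, 2.0, 5000)):
    est.update(x_nu, x_de)  # samples are seen once and never stored

print(est.ratio(0.0), est.ratio(4.0))
```

In this toy setting a low estimated ratio flags a point as a likely outlier, so a point far from the inlier mass, such as x = 4, should score lower than a point at x = 0.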