Voot Tangkaratt, Hiroaki Sasaki, Masashi Sugiyama
University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan, and RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan
Neural Comput. 2017 Aug;29(8):2076-2122. doi: 10.1162/NECO_a_00986. Epub 2017 Jun 9.
A typical goal of linear supervised dimension reduction is to find a low-dimensional subspace of the input space such that the projected input variables preserve maximal information about the output variables. The dependence-maximization approach solves the supervised dimension-reduction problem by maximizing a statistical dependence between the projected input variables and the output variables. A well-known statistical dependence measure is mutual information (MI), which is based on the Kullback-Leibler (KL) divergence. However, the KL divergence is known to be sensitive to outliers. Quadratic MI (QMI) is a variant of MI based on the L2 distance, which is more robust against outliers than the KL divergence, and a computationally efficient method to estimate QMI from data, least-squares QMI (LSQMI), has been proposed recently. For these reasons, developing a supervised dimension-reduction method based on LSQMI seems promising. However, it is not QMI itself but the derivative of QMI that is needed for subspace search in linear supervised dimension reduction, and the derivative of an accurate QMI estimator is not necessarily a good estimator of the derivative of QMI. In this letter, we propose to directly estimate the derivative of QMI without estimating QMI itself. We show that this direct estimation yields a more accurate estimate of the derivative of QMI than differentiating an estimated QMI. Finally, we develop a linear supervised dimension-reduction algorithm that efficiently uses the proposed derivative estimator and demonstrate through experiments that the proposed method is more robust against outliers than existing methods.
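As a rough illustration of the dependence-maximization objective described above (and not of the LSQMI estimator or the direct derivative estimator proposed in the letter), the sketch below uses a simple plug-in estimate of QMI(W^T x, y) based on Gaussian kernel density estimates, ignoring kernel normalization constants. The bandwidths, toy data, and function names are illustrative assumptions; in practice the projection matrix W would be found by maximizing such an objective over the subspace.

```python
# A minimal sketch, assuming a Gaussian-KDE plug-in estimate of
# QMI(Z, Y) = integral of (p(z, y) - p(z) p(y))^2 over z and y,
# evaluated on projected inputs z = W^T x (constants omitted).
import numpy as np

def gauss_gram(a, sigma):
    """Pairwise Gaussian kernel matrix K_ij = exp(-||a_i - a_j||^2 / (2 sigma^2))."""
    sq = np.sum(a ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * a @ a.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def qmi_plugin(z, y, sigma_z=0.5, sigma_y=0.5):
    """Plug-in QMI estimate: joint term + marginal term - 2 * cross term."""
    Kz = gauss_gram(z, sigma_z)
    Ky = gauss_gram(y, sigma_y)
    term_joint = np.mean(Kz * Ky)
    term_marg = np.mean(Kz) * np.mean(Ky)
    term_cross = np.mean(Kz.mean(axis=1) * Ky.mean(axis=1))
    return term_joint + term_marg - 2.0 * term_cross

# Toy data: only the first input coordinate carries information about y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(2.0 * X[:, :1]) + 0.1 * rng.normal(size=(200, 1))

w_relevant = np.eye(5)[:, :1]    # projects onto the informative coordinate
w_irrelevant = np.eye(5)[:, 4:5] # projects onto a noise coordinate
print("QMI(relevant)  ", qmi_plugin(X @ w_relevant, y))
print("QMI(irrelevant)", qmi_plugin(X @ w_irrelevant, y))
```

Under this objective, the projection onto the informative coordinate attains a larger QMI estimate than a projection onto noise, which is what the dependence-maximization approach exploits when searching for the subspace.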