Department of Mathematical Informatics, University of Tokyo, Bunkyo-ku, Tokyo 113-8656, Japan.
Neural Comput. 2013 Mar;25(3):725-58. doi: 10.1162/NECO_a_00407. Epub 2012 Dec 28.
The goal of sufficient dimension reduction in supervised learning is to find the low-dimensional subspace of input features that retains all of the information the input features carry about the output values. In this letter, we propose a novel sufficient dimension-reduction method that uses a squared-loss variant of mutual information as a dependency measure. We approximate squared-loss mutual information with a density-ratio estimator, formulated as a minimum contrast estimator over parametric or nonparametric models. Because cross-validation can be used to choose an appropriate model, our method requires no prespecified structure on the underlying distributions. We elucidate the asymptotic bias of our estimator on parametric models and its asymptotic convergence rate on nonparametric models. The convergence analysis relies on a uniform tail bound for U-processes, and the convergence rate is characterized by the bracketing entropy of the model. We then develop a natural gradient algorithm on the Grassmann manifold for the sufficient-subspace search. The analytic form of our estimator allows the gradient to be computed efficiently. Numerical experiments show that the proposed method compares favorably with existing dimension-reduction approaches on artificial and benchmark data sets.
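For concreteness, here is a minimal sketch of the least-squares SMI estimator the abstract refers to, assuming a linear-in-parameters density-ratio model over Gaussian product kernels with ridge regularization; the function name lsmi, the kernel width sigma, the regularization strength lam, and the number of kernel centers are illustrative choices, not the letter's exact setup. The idea is to fit w(x, y) ≈ p(x, y) / (p(x) p(y)) by least squares and plug it into SMI = E_{p(x,y)}[w]/2 − 1/2, which admits the analytic solution the abstract mentions.

```python
import numpy as np

def gauss_kernel(a, b, sigma):
    """Gaussian kernel matrix: K[i, l] = exp(-||a_i - b_l||^2 / (2 sigma^2))."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lsmi(x, y, sigma=1.0, lam=1e-3, n_centers=100, rng=None):
    """Least-squares estimate of squared-loss mutual information (SMI).

    Fits the density ratio w(x, y) = p(x, y) / (p(x) p(y)) with a linear
    model over product kernels centered at sampled pairs, then evaluates
    SMI_hat = h' alpha / 2 - 1/2 using the analytic ridge solution
    alpha = (H + lam I)^{-1} h.
    """
    rng = np.random.default_rng(rng)
    n = len(x)
    c = rng.choice(n, size=min(n_centers, n), replace=False)
    K = gauss_kernel(x, x[c], sigma)       # input-side kernel values, n x b
    L = gauss_kernel(y, y[c], sigma)       # output-side kernel values, n x b
    # H factorizes because each basis function is a product kernel:
    # H[l, l'] = mean_i(K_il K_il') * mean_j(L_jl L_jl')
    H = (K.T @ K / n) * (L.T @ L / n)
    h = (K * L).mean(axis=0)               # averages over the paired samples
    alpha = np.linalg.solve(H + lam * np.eye(len(c)), h)
    return 0.5 * h @ alpha - 0.5

# Quick check: dependent pairs give a clearly positive estimate,
# independent pairs an estimate near zero.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
y = x[:, :1] + 0.1 * rng.normal(size=(500, 1))   # y depends only on x's first coordinate
print(lsmi(x, y, sigma=0.5, rng=0))              # substantially > 0
print(lsmi(x, rng.normal(size=(500, 1)), sigma=0.5, rng=0))  # approximately 0
```

The factorization of H into a Hadamard product of two Gram matrices is what keeps the estimator cheap: the double sum over all n² sample pairs never has to be formed explicitly.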
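The sufficient-subspace search can be sketched on top of this estimator. The letter uses a natural-gradient (geodesic) update on the Grassmann manifold with the gradient derived from the analytic LSMI formula; the substitute below uses a generic tangent-space projection with a QR retraction and a finite-difference gradient purely for illustration, reusing x, y, and lsmi from the sketch above. grassmann_step, the step sizes, and the iteration count are hypothetical choices.

```python
import numpy as np

def grassmann_step(W, grad, step=0.1):
    """One projected-gradient step on the Grassmann manifold.

    W: (d, m) with orthonormal columns spanning the candidate subspace;
    grad: Euclidean gradient of the objective with respect to W.
    The gradient is projected onto the tangent space and the update is
    retracted to the manifold by a QR decomposition (a cheap stand-in
    for the geodesic update used in the letter).
    """
    delta = grad - W @ (W.T @ grad)          # drop the component inside span(W)
    Q, R = np.linalg.qr(W + step * delta)    # retract to orthonormal columns
    return Q * np.sign(np.diag(R))           # fix column signs so steps stay continuous

# Subspace search: maximize lsmi(x @ W, y) over W.  A finite-difference
# gradient keeps the sketch short; the letter instead differentiates the
# analytic LSMI solution, which is far cheaper.
rng = np.random.default_rng(1)
d, m, eps = x.shape[1], 1, 1e-4
W = np.linalg.qr(rng.normal(size=(d, m)))[0]
for _ in range(30):
    base = lsmi(x @ W, y, sigma=0.5, rng=0)  # fixed centers -> stable differences
    grad = np.zeros_like(W)
    for i in range(d):
        for j in range(m):
            Wp = W.copy()
            Wp[i, j] += eps
            grad[i, j] = (lsmi(x @ Wp, y, sigma=0.5, rng=0) - base) / eps
    W = grassmann_step(W, grad, step=0.5)
print(W.ravel())  # should align with the first coordinate axis
```

The QR retraction keeps every iterate exactly on the manifold, the same invariant the geodesic update preserves; the finite-difference loop is precisely the cost that the letter's analytic gradient formula eliminates.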