Feng Yunlong, Wu Qiang
Department of Mathematics and Statistics, State University of New York at Albany, Albany, NY 12222, U.S.A.
Department of Mathematical Sciences, Middle Tennessee State University, Murfreesboro, TN 37132, U.S.A.
Neural Comput. 2021 May 13;33(6):1656-1697. doi: 10.1162/neco_a_01384.
We develop in this letter a framework of empirical gain maximization (EGM) to address the robust regression problem in which heavy-tailed noise or outliers may be present in the response variable. The idea of EGM is to approximate the density function of the noise distribution rather than approximating the truth function directly, as is usual. Unlike classical maximum likelihood estimation, which assigns equal importance to all observations and can be problematic in the presence of abnormal observations, EGM schemes can be interpreted from a minimum distance estimation viewpoint and allow aberrant observations to be ignored. Furthermore, we show that several well-known robust nonconvex regression paradigms, such as Tukey regression and truncated least squares regression, can be reformulated within this new framework. We then develop a learning theory for EGM, by means of which a unified analysis can be conducted for these well-established but not fully understood regression approaches. The new framework also yields a novel interpretation of existing bounded nonconvex loss functions: within it, two seemingly unrelated notions, Tukey's biweight loss from robust regression and the triweight kernel from nonparametric smoothing, turn out to be closely related. More precisely, we show that Tukey's biweight loss can be derived from the triweight kernel. Other bounded nonconvex loss functions frequently employed in machine learning, such as the truncated square loss, the Geman-McClure loss, and the exponential squared loss, can likewise be derived from certain smoothing kernels in statistics. In addition, the new framework enables us to devise new bounded nonconvex loss functions for robust learning.
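To make the biweight-from-triweight claim concrete, here is a minimal worked sketch under the EGM reading of the abstract: the gain function is taken to be a smoothing kernel evaluated at scaled residuals, with the standard textbook normalization constants rather than quantities taken from the paper itself.

\[
K(u) = \tfrac{35}{32}\,\bigl(1-u^2\bigr)^3 \,\mathbf{1}_{\{|u|\le 1\}} \qquad \text{(triweight kernel)}
\]
EGM maximizes the empirical gain
\[
\widehat{f} = \arg\max_{f} \frac{1}{n} \sum_{i=1}^{n} K\!\left(\frac{y_i - f(x_i)}{\sigma}\right),
\]
which is equivalent to minimizing the induced loss
\[
\ell_\sigma(t) = K(0) - K(t/\sigma) = \tfrac{35}{32}\Bigl[1 - \bigl(1 - (t/\sigma)^2\bigr)^3\Bigr] \quad \text{for } |t| \le \sigma,
\]
with \(\ell_\sigma(t) = \tfrac{35}{32}\) for \(|t| > \sigma\). Up to rescaling, this is Tukey's biweight loss with tuning constant \(c = \sigma\): observations whose residuals exceed \(\sigma\) contribute only a constant and are effectively ignored, which is the robustness mechanism the abstract describes.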
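For readers who prefer code, the following is a minimal sketch of an EGM fit for a linear model, assuming gradient ascent on the empirical gain with the triweight kernel above. The names (egm_linear, triweight_gain), the learning rate, and the least-squares warm start are illustrative choices, not the paper's algorithm.

import numpy as np

def triweight_gain(u):
    # Triweight kernel (35/32)(1 - u^2)^3 on [-1, 1], zero outside.
    return np.where(np.abs(u) <= 1.0, (35.0 / 32.0) * (1.0 - u**2) ** 3, 0.0)

def triweight_gain_grad(u):
    # Derivative of the triweight kernel; zero outside [-1, 1], so
    # observations with |residual| > sigma do not move the estimate.
    return np.where(np.abs(u) <= 1.0, -(105.0 / 16.0) * u * (1.0 - u**2) ** 2, 0.0)

def egm_linear(X, y, sigma=1.0, lr=0.1, n_iter=500):
    # Gradient ascent on (1/n) * sum_i K((y_i - x_i @ w) / sigma).
    # The objective is nonconvex, so the warm start matters.
    n = X.shape[0]
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares warm start
    for _ in range(n_iter):
        r = (y - X @ w) / sigma
        # d(gain)/dw = -(1/(n * sigma)) * X^T K'(r); take an ascent step.
        w += lr * (-(X.T @ triweight_gain_grad(r)) / (n * sigma))
    return w

A quick usage example with gross outliers planted in the response:

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
y[:10] += 20.0  # outliers in the response variable
w_hat = egm_linear(X, y, sigma=1.0)

Because the gain is flat beyond the scale sigma, the outlying points contribute zero gradient and the fit is driven by the inliers alone, in line with the minimum distance estimation interpretation in the abstract.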