Aditya Kumar Akash, Vishnu Suresh Lokhande, Sathya N. Ravi, Vikas Singh
University of Wisconsin-Madison.
University of Illinois at Chicago.
Proc AAAI Conf Artif Intell. 2021 Feb;35(8):6582-6591. Epub 2021 May 18.
Learning invariant representations is a critical first step in a number of machine learning tasks. A common approach is based on the so-called information bottleneck principle, in which an application-dependent function of mutual information is carefully chosen and optimized. Unfortunately, in practice, these functions are not well suited for optimization because the resulting losses are agnostic of the metric structure of the model's parameters. We introduce a class of losses for learning representations that are invariant to some extraneous variable of interest, obtained by inverting the class of contrastive losses, i.e., the inverse contrastive loss (ICL). We show that if the extraneous variable is binary, then optimizing ICL is equivalent to optimizing a regularized maximum mean discrepancy (MMD) divergence. More generally, we also show that if we are given a metric on the sample space, our formulation of ICL can be decomposed into a sum of convex functions of that metric. Our experimental results indicate that models obtained by optimizing ICL achieve significantly better invariance to the extraneous variable for a fixed desired level of accuracy. In a variety of experimental settings, we demonstrate the applicability of ICL for learning invariant representations for both continuous and discrete extraneous variables. The project page with code is available at https://github.com/adityakumarakash/ICL.
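To make the binary case concrete, the following is a minimal sketch of the MMD-regularizer view stated in the abstract, written in PyTorch. It is an illustration under assumptions, not the authors' implementation (see the repository above for that): the Gaussian kernel choice, the bandwidth sigma, and the names encoder, head, task_loss, and lam are placeholders introduced here for exposition.

import torch

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise Gaussian kernel values between rows of x and rows of y.
    d2 = torch.cdist(x, y, p=2) ** 2
    return torch.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(z0, z1, sigma=1.0):
    # Biased estimate of the squared MMD between two sets of representations.
    return (gaussian_kernel(z0, z0, sigma).mean()
            + gaussian_kernel(z1, z1, sigma).mean()
            - 2.0 * gaussian_kernel(z0, z1, sigma).mean())

def invariance_penalty(z, c, sigma=1.0):
    # Split the batch of representations z by the binary extraneous variable c
    # and penalize the divergence between the two groups; assumes both groups
    # are non-empty in the batch.
    return mmd2(z[c == 0], z[c == 1], sigma)

# Hypothetical usage: add the penalty to the task loss with weight lam, e.g.
#   z = encoder(x)
#   loss = task_loss(head(z), y) + lam * invariance_penalty(z, c)

Driving this penalty to zero matches the two conditional distributions of the representation given the extraneous variable, which is one way to read the binary-variable equivalence claimed above.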