Li Mao, Jiang Kaiqi, Zhang Xinhua
Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607.
Adv Neural Inf Process Syst. 2021;34:25824-25838.
Probability discrepancy measures are a fundamental construct in numerous machine learning models, such as weakly supervised learning and generative modeling. However, most measures overlook the fact that the distributions are not the end product of learning, but rather the input to a downstream predictor. It is therefore important to warp the probability discrepancy measure towards the end task, and towards this goal, we propose a new approach based on bi-level optimization, so that the two distributions are compared not uniformly against the entire hypothesis space, but only with respect to the optimal predictor for the downstream task. When applied to margin disparity discrepancy and contrastive domain discrepancy, our method significantly improves performance in unsupervised domain adaptation and enjoys a much more principled training process.
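As a schematic illustration of the contrast drawn in the abstract (not the paper's exact notation: the symbols $P$, $Q$, $\mathcal{H}$, $\ell$, and $R_{\mathrm{task}}$ below are illustrative placeholders), a standard discrepancy compares the two distributions uniformly over the whole hypothesis class,
\[
D_{\mathcal{H}}(P, Q) \;=\; \sup_{h \in \mathcal{H}} \bigl|\, \mathbb{E}_{x \sim P}[\ell(h, x)] - \mathbb{E}_{x \sim Q}[\ell(h, x)] \,\bigr|,
\]
whereas the bi-level formulation evaluates the discrepancy only at the predictor that is optimal for the downstream task,
\[
\min_{\theta} \; D\bigl(P_{\theta}, Q_{\theta};\, h^{*}\bigr)
\quad \text{s.t.} \quad
h^{*} \in \arg\min_{h \in \mathcal{H}} R_{\mathrm{task}}(h; P_{\theta}),
\]
where the outer problem adjusts the representation $\theta$ while the inner problem fits $h^{*}$ to the end task.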