Ravi Sathya N, Venkatesh Abhay, Fung Glenn M, Singh Vikas
University of Illinois at Chicago.
University of Wisconsin-Madison.
Proc AAAI Conf Artif Intell. 2020 Jun 16;34(4):5487-5494. doi: 10.1609/aaai.v34i04.5999.
Data-dependent regularization is known to benefit a wide variety of problems in machine learning. Often, these regularizers cannot be easily decomposed into a sum over a finite number of terms, e.g., a sum over individual example-wise terms. The Fβ measure, area under the ROC curve (AUCROC), and precision at a fixed recall (P@R) are prominent examples that are used in many applications. We find that for most medium- to large-sized datasets, scalability issues severely limit our ability to leverage the benefits of such regularizers. Importantly, the key technical impediment, despite some recent progress, is that such objectives remain difficult to optimize via backpropagation procedures. While an efficient general-purpose strategy for this problem remains elusive, in this paper we show that for many data-dependent nondecomposable regularizers that are relevant in applications, sizable gains in efficiency are possible with minimal code-level changes; in other words, no specialized tools or numerical schemes are needed. Our procedure involves a reparameterization followed by a partial dualization, which leads to a formulation with provably cheap projection operators. We present a detailed analysis of the runtime and convergence properties of our algorithm. On the experimental side, we show that a direct use of our scheme significantly improves the state-of-the-art IOU measures reported for the MSCOCO Stuff segmentation dataset.
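To make the notion of a nondecomposable measure concrete, the following is a minimal illustrative sketch and not the authors' method. It assumes NumPy and uses hypothetical helper names (`auc_pairwise`, `f_beta`). It shows that AUCROC is defined over pairs of positive and negative examples, and that an F-measure computed on a full dataset generally differs from the average of the same measure computed on its parts, which is why neither can be written as a simple sum of example-wise terms.

```python
import numpy as np


def auc_pairwise(scores, labels):
    """AUCROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Hypothetical helper for illustration only.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Every positive is compared against every negative: a sum over pairs,
    # not over individual examples.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def f_beta(preds, labels, beta=1.0):
    """F-beta score from hard 0/1 predictions (hypothetical helper)."""
    preds = np.asarray(preds, dtype=int)
    labels = np.asarray(labels, dtype=int)
    tp = np.sum((preds == 1) & (labels == 1))
    fp = np.sum((preds == 1) & (labels == 0))
    fn = np.sum((preds == 0) & (labels == 1))
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return float((1 + b2) * tp / denom) if denom > 0 else 0.0


# Toy data, chosen only for illustration.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
preds = [1, 1, 0, 0]

auc = auc_pairwise(scores, labels)

# F1 on the full set versus the mean of F1 over two halves of the same set.
f1_full = f_beta(preds, labels)
f1_halves = 0.5 * (f_beta(preds[:2], labels[:2]) + f_beta(preds[2:], labels[2:]))
```

In this toy case the full-data F1 and the averaged per-half F1 disagree, so a mini-batch average of the measure is not an unbiased stand-in for the measure on the whole dataset. This is one way to see the scalability difficulty the abstract refers to.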