Geiger Bernhard C, Fischer Ian S
Know-Center GmbH, Inffeldgasse 13/6, 8010 Graz, Austria.
Google Research, Mountain View, CA 94043, USA.
Entropy (Basel). 2020 Oct 29;22(11):1229. doi: 10.3390/e22111229.
In this short note, we relate the variational bounds proposed in Alemi et al. (2017) and Fischer (2020) for the information bottleneck (IB) and the conditional entropy bottleneck (CEB) functional, respectively. Although the two functionals were shown to be equivalent, it was empirically observed that optimizing bounds on the CEB functional achieves better generalization performance and adversarial robustness than optimizing those on the IB functional. This work tries to shed light on this issue by showing that, in the most general setting, no ordering can be established between these variational bounds, while such an ordering can be enforced by restricting the feasible sets over which the optimizations take place. The absence of such an ordering in the general setup suggests that the variational bound on the CEB functional is either more amenable to optimization or a relevant cost function for optimization in its own right, i.e., without justification from the IB or CEB functionals.
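The equivalence of the two functionals mentioned above can be checked numerically on a small discrete example. The sketch below is illustrative only and is not taken from the paper. It assumes the parameterizations L_IB = I(X;Z) - beta * I(Y;Z) and L_CEB = I(X;Z|Y) - gamma * I(Y;Z), and the Markov chain Z - X - Y (the representation Z depends on Y only through X). All function names and alphabet sizes are hypothetical. Under these assumptions the chain rule gives I(X;Z|Y) = I(X;Z) - I(Y;Z), so the functionals coincide for beta = gamma + 1. The sketch concerns the exact functionals only; it says nothing about the ordering of the variational bounds, which is the subject of the note.

```python
import math
import random


def random_dist(n, rng):
    """Return a strictly positive random probability vector of length n."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]


def mutual_information(joint):
    """I(A;B) in nats for a joint probability table joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (pa[a] * pb[b]))
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
        if p > 0
    )


def build_joint(nx=4, ny=3, nz=5, seed=0):
    """Build p(x, y, z) = p(x) p(y|x) p(z|x), i.e. the Markov chain Z - X - Y.

    The alphabet sizes and the seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    p_x = random_dist(nx, rng)
    p_y_given_x = [random_dist(ny, rng) for _ in range(nx)]
    p_z_given_x = [random_dist(nz, rng) for _ in range(nx)]
    return [
        [
            [p_x[x] * p_y_given_x[x][y] * p_z_given_x[x][z] for z in range(nz)]
            for y in range(ny)
        ]
        for x in range(nx)
    ]


def information_terms(p):
    """Return (I(X;Z), I(Y;Z), I(X;Z|Y)) for a joint table p[x][y][z]."""
    nx, ny, nz = len(p), len(p[0]), len(p[0][0])
    p_xz = [[sum(p[x][y][z] for y in range(ny)) for z in range(nz)] for x in range(nx)]
    p_yz = [[sum(p[x][y][z] for x in range(nx)) for z in range(nz)] for y in range(ny)]
    i_xz = mutual_information(p_xz)
    i_yz = mutual_information(p_yz)
    # I(X;Z|Y) = sum_y p(y) * I(X;Z | Y=y)
    i_xz_given_y = 0.0
    for y in range(ny):
        p_y = sum(p[x][y][z] for x in range(nx) for z in range(nz))
        cond = [[p[x][y][z] / p_y for z in range(nz)] for x in range(nx)]
        i_xz_given_y += p_y * mutual_information(cond)
    return i_xz, i_yz, i_xz_given_y


def ib_functional(i_xz, i_yz, beta):
    """Assumed IB parameterization: I(X;Z) - beta * I(Y;Z)."""
    return i_xz - beta * i_yz


def ceb_functional(i_xz_given_y, i_yz, gamma):
    """Assumed CEB parameterization: I(X;Z|Y) - gamma * I(Y;Z)."""
    return i_xz_given_y - gamma * i_yz


if __name__ == "__main__":
    i_xz, i_yz, i_xz_given_y = information_terms(build_joint())
    gamma = 1.5
    print("I(X;Z|Y) - (I(X;Z) - I(Y;Z)) =", i_xz_given_y - (i_xz - i_yz))
    print("CEB(gamma)  =", ceb_functional(i_xz_given_y, i_yz, gamma))
    print("IB(gamma+1) =", ib_functional(i_xz, i_yz, gamma + 1.0))
```

Running the script should show the first difference at the level of floating-point rounding and the two functional values agreeing to the same precision. The identity relies on the Markov assumption built into `build_joint`; for a joint distribution that violates Z - X - Y it does not hold in general.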