IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):863-875. doi: 10.1109/TPAMI.2017.2703082. Epub 2017 May 12.
In this paper, we seek to improve deep neural networks by generalizing the pooling operations that play a central role in current architectures. We pursue a careful exploration of approaches that allow pooling to learn and to adapt to complex and variable patterns. The two primary directions are: (1) learning a pooling function via (two strategies for) combining max and average pooling, and (2) learning a pooling function in the form of a tree-structured fusion of pooling filters that are themselves learned. In our experiments, every generalized pooling operation we explore improves performance when used in place of average or max pooling. We experimentally demonstrate that the proposed pooling operations provide a boost in invariance properties relative to conventional pooling and set the state of the art on several widely adopted benchmark datasets. These benefits come with only a light increase in computational overhead during training (an additional 5 to 15 percent in time complexity) and a very modest increase in the number of model parameters (e.g., 1, 9, and 27 additional parameters for the mixed, gated, and 2-level tree pooling operators, respectively). To gain more insight into the proposed pooling methods, we also visualize the learned pooling masks and the embeddings of the internal feature responses for the different pooling operations. Our proposed pooling operations are easy to implement and can be applied within various deep neural network architectures.
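The mixed and gated operators described above can be sketched concretely. The following is a minimal NumPy illustration, not the authors' implementation: mixed pooling blends max and average pooling with a single learned scalar `a` (1 parameter), while gated pooling computes a per-region sigmoid gate from the region's own responses via a learned mask `w` (9 parameters for a 3×3 pooling region, matching the counts quoted in the abstract). The function names and shapes here are assumptions for illustration only, and the parameters are shown as fixed inputs rather than being learned by backpropagation.

```python
import numpy as np

def max_avg_regions(x, k=3):
    """Split a (H, W) response map into non-overlapping k x k regions;
    return the per-region max and per-region average (each (H/k, W/k))."""
    H, W = x.shape
    r = x.reshape(H // k, k, W // k, k)
    return r.max(axis=(1, 3)), r.mean(axis=(1, 3))

def mixed_pool(x, a, k=3):
    """Mixed pooling: one learned scalar a in [0, 1] blends max and
    average pooling for the whole layer (1 extra parameter)."""
    mx, av = max_avg_regions(x, k)
    return a * mx + (1.0 - a) * av

def gated_pool(x, w, k=3):
    """Gated pooling: the blend is a per-region sigmoid gate computed
    from the region itself with a learned (k, k) mask w (k*k extra
    parameters; 9 when k = 3)."""
    H, W = x.shape
    # Regroup into (H/k, W/k, k, k) regions, then take the gated inner product.
    r = x.reshape(H // k, k, W // k, k).transpose(0, 2, 1, 3)
    gate = 1.0 / (1.0 + np.exp(-np.einsum('ijkl,kl->ij', r, w)))
    mx, av = max_avg_regions(x, k)
    return gate * mx + (1.0 - gate) * av

# Usage: a = 1 recovers max pooling, a = 0 recovers average pooling,
# and a zero gate mask makes gated pooling a 50/50 blend of the two.
x = np.arange(36.0).reshape(6, 6)
print(mixed_pool(x, 1.0))                    # equals 3x3 max pooling
print(gated_pool(x, np.zeros((3, 3))))       # 0.5*max + 0.5*avg per region
```

The tree pooling variant generalizes this further by learning the pooling filters themselves and fusing them through learned gates at internal tree nodes, which is why its 2-level form adds 27 parameters (three 3×3 filters).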