
Understanding the Distributions of Aggregation Layers in Deep Neural Networks

Authors

Ong Eng-Jon, Husain Sameed, Bober Miroslaw

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):5536-5550. doi: 10.1109/TNNLS.2022.3207790. Epub 2024 Apr 4.

Abstract

The process of aggregation is ubiquitous in almost all deep network models. It serves as an important mechanism for consolidating deep features into a more compact representation, while increasing robustness to overfitting and providing spatial invariance. In particular, the proximity of global aggregation layers to the output layers of DNNs means that aggregated features directly influence a deep network's performance. A better understanding of this relationship can be obtained using information-theoretic methods. However, this requires knowledge of the distributions of the activations of aggregation layers. To achieve this, we propose a novel mathematical formulation for analytically modeling the probability distributions of the output values of layers involved in deep feature aggregation. An important outcome is our ability to analytically predict the Kullback-Leibler (KL) divergence of output nodes in a DNN. We also experimentally verify our theoretical predictions against empirical observations across a broad range of classification tasks and datasets.
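To make the two ingredients of the abstract concrete, the sketch below shows a common form of global aggregation (global average pooling, one typical choice; the paper covers aggregation layers more generally) and an empirical KL divergence between discretized activation distributions. This is an illustrative sketch only, not the paper's analytical formulation; the function names and the uniform reference distribution are assumptions for the example.

```python
import numpy as np

def global_average_pool(features):
    """Aggregate an (H, W, C) feature map into a compact length-C vector."""
    return features.mean(axis=(0, 1))

def kl_divergence(p, q, eps=1e-12):
    """Empirical KL(p || q) between two discrete distributions.

    Inputs may be unnormalized counts; eps guards against log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
fmap = rng.normal(size=(7, 7, 64))   # stand-in for a conv feature map
pooled = global_average_pool(fmap)   # shape (64,): the aggregated features

# Discretize the pooled activations and compare against a reference
# distribution (uniform here, purely for illustration).
hist, _ = np.histogram(pooled, bins=10, range=(-1.0, 1.0))
uniform = np.ones_like(hist)
print(kl_divergence(hist, uniform))
```

In the paper's setting, the point is that such divergences can be predicted analytically from a model of the aggregation layer's output distribution, rather than estimated empirically as above.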

