
Convergence Behavior of DNNs with Mutual-Information-Based Regularization

Authors

Jónsson Hlynur, Cherubini Giovanni, Eleftheriou Evangelos

Affiliations

IBM Research Zurich, 8803 Rüschlikon, Switzerland.

Published

Entropy (Basel). 2020 Jun 30;22(7):727. doi: 10.3390/e22070727.

DOI: 10.3390/e22070727
PMID: 33286499
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7517266/
Abstract

Information theory concepts are leveraged with the goal of better understanding and improving Deep Neural Networks (DNNs). The information plane of neural networks describes the behavior during training of the mutual information at various depths between input/output and hidden-layer variables. Previous analysis revealed that most of the training epochs are spent on compressing the input, in some networks where finiteness of the mutual information can be established. However, the estimation of mutual information is nontrivial for high-dimensional continuous random variables. Therefore, the computation of the mutual information for DNNs and its visualization on the information plane mostly focused on low-complexity fully connected networks. In fact, even the existence of the compression phase in complex DNNs has been questioned and viewed as an open problem. In this paper, we present the convergence of mutual information on the information plane for a high-dimensional VGG-16 Convolutional Neural Network (CNN) by resorting to Mutual Information Neural Estimation (MINE), thus confirming and extending the results obtained with low-dimensional fully connected networks. Furthermore, we demonstrate the benefits of regularizing a network, especially for a large number of training epochs, by adopting mutual information estimates as additional terms in the loss function characteristic of the network. Experimental results show that the regularization stabilizes the test accuracy and significantly reduces its variance.
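The Donsker-Varadhan lower bound that MINE optimizes, I(X;Z) >= E_joint[T(x,z)] - log E_marginal[exp T(x,z)], can be illustrated on a toy problem where the mutual information is known in closed form. The sketch below is an assumption-laden illustration, not the paper's implementation: it uses NumPy, and it replaces the trained statistics network T of MINE with the known optimal critic for a pair of correlated Gaussians, for which I(X;Z) = -0.5 * log(1 - rho^2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: (x, z) jointly Gaussian, unit variances, correlation rho.
rho, n = 0.5, 200_000
x = rng.standard_normal(n)
z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

def critic(x, z):
    # Optimal Donsker-Varadhan critic for this Gaussian toy case:
    # T*(x, z) = log p(x, z) / (p(x) p(z)), up to an additive constant
    # (the DV bound is invariant to constants added to T).  In MINE this
    # function is a small neural network trained by gradient ascent.
    return (rho * x * z - 0.5 * rho**2 * (x**2 + z**2)) / (1 - rho**2)

def dv_bound(x, z):
    # I(X;Z) >= E_joint[T] - log E_marginal[exp(T)]
    joint_term = critic(x, z).mean()
    z_shuffled = rng.permutation(z)  # shuffling breaks the joint coupling
    marginal_term = np.log(np.mean(np.exp(critic(x, z_shuffled))))
    return joint_term - marginal_term

true_mi = -0.5 * np.log(1 - rho**2)
print(f"DV estimate: {dv_bound(x, z):.3f}, true MI: {true_mi:.3f}")
```

With a learned critic, the same quantity can be added (with a weight) to the classification loss, which is the regularization scheme the abstract describes.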


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4992/7517266/0ce6283aced1/entropy-22-00727-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4992/7517266/a753374ff75b/entropy-22-00727-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4992/7517266/bf8c87668d78/entropy-22-00727-g003a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4992/7517266/0186bbb209da/entropy-22-00727-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4992/7517266/0c14f00377c9/entropy-22-00727-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4992/7517266/5d28b2473c63/entropy-22-00727-g007.jpg

Similar Articles

1. Convergence Behavior of DNNs with Mutual-Information-Based Regularization.
Entropy (Basel). 2020 Jun 30;22(7):727. doi: 10.3390/e22070727.
2. Examining the Causal Structures of Deep Neural Networks Using Information Theory.
Entropy (Basel). 2020 Dec 18;22(12):1429. doi: 10.3390/e22121429.
3. Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy.
Entropy (Basel). 2023 Jun 3;25(6):899. doi: 10.3390/e25060899.
4. Dissecting Deep Learning Networks-Visualizing Mutual Information.
Entropy (Basel). 2018 Oct 26;20(11):823. doi: 10.3390/e20110823.
5. Utilizing Information Bottleneck to Evaluate the Capability of Deep Neural Networks for Image Classification.
Entropy (Basel). 2019 May 1;21(5):456. doi: 10.3390/e21050456.
6. A Geometric Perspective on Information Plane Analysis.
Entropy (Basel). 2021 Jun 3;23(6):711. doi: 10.3390/e23060711.
7. Big in Japan: Regularizing Networks for Solving Inverse Problems.
J Math Imaging Vis. 2020;62(3):445-455. doi: 10.1007/s10851-019-00911-1. Epub 2019 Oct 3.
8. Deep convolutional neural network and IoT technology for healthcare.
Digit Health. 2024 Jan 17;10:20552076231220123. doi: 10.1177/20552076231220123. eCollection 2024 Jan-Dec.
9. Information Flows of Diverse Autoencoders.
Entropy (Basel). 2021 Jul 5;23(7):862. doi: 10.3390/e23070862.
10. Aligned deep neural network for integrative analysis with high-dimensional input.
J Biomed Inform. 2023 Aug;144:104434. doi: 10.1016/j.jbi.2023.104434. Epub 2023 Jun 28.

Cited By

1. On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches.
Entropy (Basel). 2023 Jul 14;25(7):1063. doi: 10.3390/e25071063.
2. Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy.
Entropy (Basel). 2023 Jun 3;25(6):899. doi: 10.3390/e25060899.
3. Information Bottleneck Theory Based Exploration of Cascade Learning.

References

1. On Information Plane Analyses of Neural Network Classifiers-A Review.
IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7039-7051. doi: 10.1109/TNNLS.2021.3089037. Epub 2022 Nov 30.
2. Understanding Convolutional Neural Networks With Information Theory: An Initial Exploration.
IEEE Trans Neural Netw Learn Syst. 2021 Jan;32(1):435-442. doi: 10.1109/TNNLS.2020.2968509. Epub 2021 Jan 4.
3. Multivariate Extension of Matrix-Based Rényi's α-Order Entropy Functional.
Entropy (Basel). 2021 Oct 18;23(10):1360. doi: 10.3390/e23101360.
4. Information Bottleneck Analysis by a Conditional Mutual Information Bound.
Entropy (Basel). 2021 Jul 29;23(8):974. doi: 10.3390/e23080974.
5. Information Bottleneck: Theory and Applications in Deep Learning.
Entropy (Basel). 2020 Dec 14;22(12):1408. doi: 10.3390/e22121408.
6. Multivariate Extension of Matrix-Based Rényi's α-Order Entropy Functional.
IEEE Trans Pattern Anal Mach Intell. 2020 Nov;42(11):2960-2966. doi: 10.1109/TPAMI.2019.2932976. Epub 2019 Aug 5.
7. Learning Representations for Neural Network-Based Classification Using the Information Bottleneck Principle.
IEEE Trans Pattern Anal Mach Intell. 2020 Sep;42(9):2225-2239. doi: 10.1109/TPAMI.2019.2909031. Epub 2019 Apr 2.
8. Information Dropout: Learning Optimal Representations Through Noisy Computation.
IEEE Trans Pattern Anal Mach Intell. 2018 Dec;40(12):2897-2905. doi: 10.1109/TPAMI.2017.2784440. Epub 2018 Jan 10.