Examining the Causal Structures of Deep Neural Networks Using Information Theory

Authors

Scythia Marrow, Eric J. Michaud, Erik Hoel

Affiliations

Allen Discovery Center, Tufts University, Medford, MA 02155, USA.

Department of Mathematics, University of California Berkeley, Berkeley, CA 94720, USA.

Publication

Entropy (Basel). 2020 Dec 18;22(12):1429. doi: 10.3390/e22121429.

DOI: 10.3390/e22121429
PMID: 33353094
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7766755/
Abstract

Deep Neural Networks (DNNs) are often examined at the level of their response to input, such as analyzing the mutual information between nodes and data sets. Yet DNNs can also be examined at the level of causation, exploring "what does what" within the layers of the network itself. Historically, analyzing the causal structure of DNNs has received less attention than understanding their responses to input. Yet definitionally, generalizability must be a function of a DNN's causal structure as it reflects how the DNN responds to unseen or even not-yet-defined future inputs. Here, we introduce a suite of metrics based on information theory to quantify and track changes in the causal structure of DNNs during training. Specifically, we introduce the effective information (EI) of a feedforward DNN, which is the mutual information between layer input and output following a maximum-entropy perturbation. The EI can be used to assess the degree of causal influence nodes and edges have over their downstream targets in each layer. We show that the EI can be further decomposed in order to examine the sensitivity of a layer (measured by how well edges transmit perturbations) and the degeneracy of a layer (measured by how edge overlap interferes with transmission), along with estimates of the amount of integrated information of a layer. Together, these properties define where each layer lies in the "causal plane", which can be used to visualize how layer connectivity becomes more sensitive or degenerate over time, and how integration changes during training, revealing how the layer-by-layer causal structure differentiates. These results may help in understanding the generalization capabilities of DNNs and provide foundational tools for making DNNs both more generalizable and more explainable.

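The abstract's central quantity lends itself to a small worked example. Below is a minimal sketch, in Python with NumPy only, of estimating a layer's EI as defined above: drive the layer's inputs with i.i.d. uniform (maximum-entropy) noise, discretize activations, and estimate the mutual information between layer input and output. The sigmoid activation, bin count, sample size, the function names (layer_ei, sensitivity), and the convention of holding the other inputs at 0 while perturbing one node for the sensitivity term are all illustrative assumptions, not the authors' released code; degeneracy is then taken as sensitivity minus EI, following the decomposition described in the abstract.

import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def discretize(A, bins=8):
    # Bin activations in [0, 1] into `bins` levels and hash each row
    # (one layer state) to a tuple so it can serve as a discrete symbol.
    levels = np.clip((A * bins).astype(int), 0, bins - 1)
    return [tuple(row) for row in levels]

def mutual_information(xs, ys):
    # Plug-in histogram estimate of I(X; Y) in bits from paired symbols.
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * np.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def layer_ei(W, b, n=100_000, bins=8):
    # EI: inject maximum-entropy (i.i.d. uniform) noise at the layer input
    # and measure the mutual information it shares with the layer output.
    x = rng.uniform(size=(n, W.shape[0]))
    y = 1.0 / (1.0 + np.exp(-(x @ W + b)))   # sigmoid activations
    return mutual_information(discretize(x, bins), discretize(y, bins))

def sensitivity(W, b, n=100_000, bins=8):
    # Sum over edges (i -> j) of I(x_i; y_j), perturbing one input node
    # at a time with the others held at 0 (an illustrative convention).
    total = 0.0
    for i in range(W.shape[0]):
        x = np.zeros((n, W.shape[0]))
        x[:, i] = rng.uniform(size=n)
        y = 1.0 / (1.0 + np.exp(-(x @ W + b)))
        for j in range(W.shape[1]):
            total += mutual_information(discretize(x[:, [i]], bins),
                                        discretize(y[:, [j]], bins))
    return total

# Toy 2 -> 2 layer; degeneracy = sensitivity - EI per the decomposition.
W, b = rng.normal(size=(2, 2)), np.zeros(2)
ei, sens = layer_ei(W, b), sensitivity(W, b)
print(f"EI = {ei:.3f} bits, sensitivity = {sens:.3f}, degeneracy = {sens - ei:.3f}")

Note that the plug-in histogram estimate degrades quickly as layers widen or the bin count grows, since the number of joint states explodes; that is why this sketch keeps the layer tiny.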

Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/dc4b977f2278/entropy-22-01429-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/a151aed9ed08/entropy-22-01429-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/d1cb79a21833/entropy-22-01429-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/eeec4828950c/entropy-22-01429-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/df98fd797abb/entropy-22-01429-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/b7eef14e3987/entropy-22-01429-g0A1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/68637bfd37c1/entropy-22-01429-g0A2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9682/7766755/5cbeb5adede2/entropy-22-01429-g0A3.jpg

Similar Articles

1. Examining the Causal Structures of Deep Neural Networks Using Information Theory.
Entropy (Basel). 2020 Dec 18;22(12):1429. doi: 10.3390/e22121429.
2. Convergence Behavior of DNNs with Mutual-Information-Based Regularization.
Entropy (Basel). 2020 Jun 30;22(7):727. doi: 10.3390/e22070727.
3. Information Entropy Measures for Evaluation of Reliability of Deep Neural Network Results.
Entropy (Basel). 2023 Mar 27;25(4):573. doi: 10.3390/e25040573.
4. Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy.
Entropy (Basel). 2023 Jun 3;25(6):899. doi: 10.3390/e25060899.
5. Deep Convolutional Neural Networks Outperform Feature-Based But Not Categorical Models in Explaining Object Similarity Judgments.
Front Psychol. 2017 Oct 9;8:1726. doi: 10.3389/fpsyg.2017.01726. eCollection 2017.
6. An Optimal Transport Analysis on Generalization in Deep Learning.
IEEE Trans Neural Netw Learn Syst. 2023 Jun;34(6):2842-2853. doi: 10.1109/TNNLS.2021.3109942. Epub 2023 Jun 1.
7. Task-specific feature extraction and classification of fMRI volumes using a deep neural network initialized with a deep belief network: Evaluation using sensorimotor tasks.
Neuroimage. 2017 Jan 15;145(Pt B):314-328. doi: 10.1016/j.neuroimage.2016.04.003. Epub 2016 Apr 11.
8. Improving robustness of a deep learning-based lung-nodule classification model of CT images with respect to image noise.
Phys Med Biol. 2021 Dec 7;66(24). doi: 10.1088/1361-6560/ac3d16.
9. Autoencoder and restricted Boltzmann machine for transfer learning in functional magnetic resonance imaging task classification.
Heliyon. 2023 Jul 16;9(7):e18086. doi: 10.1016/j.heliyon.2023.e18086. eCollection 2023 Jul.
10. Critical Path-Based Backdoor Detection for Deep Neural Networks.
IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4032-4046. doi: 10.1109/TNNLS.2022.3201586. Epub 2024 Feb 29.

Cited By

1. Finding emergence in data by maximizing effective information.
Natl Sci Rev. 2024 Aug 12;12(1):nwae279. doi: 10.1093/nsr/nwae279. eCollection 2025 Jan.
2. Emergence and Causality in Complex Systems: A Survey of Causal Emergence and Related Quantitative Studies.
Entropy (Basel). 2024 Jan 24;26(2):108. doi: 10.3390/e26020108.

References

1. What Caused What? A Quantitative Account of Actual Causation Using Dynamical Causal Networks.
Entropy (Basel). 2019 May 2;21(5):459. doi: 10.3390/e21050459.
2. Measuring Integrated Information: Comparison of Candidate Measures in Theory and Simulation.
Entropy (Basel). 2018 Dec 25;21(1):17. doi: 10.3390/e21010017.
3. One neuron versus deep learning in aftershock prediction.
Nature. 2019 Oct;574(7776):E1-E3. doi: 10.1038/s41586-019-1582-8. Epub 2019 Oct 2.
4. Understanding autoencoders with information theoretic concepts.
Neural Netw. 2019 Sep;117:104-123. doi: 10.1016/j.neunet.2019.05.003. Epub 2019 May 15.
5. Learning Representations for Neural Network-Based Classification Using the Information Bottleneck Principle.
IEEE Trans Pattern Anal Mach Intell. 2020 Sep;42(9):2225-2239. doi: 10.1109/TPAMI.2019.2909031. Epub 2019 Apr 2.
6. Can the macro beat the micro? Integrated information across spatiotemporal scales.
Neurosci Conscious. 2016 Aug 31;2016(1):niw012. doi: 10.1093/nc/niw012. eCollection 2016.
7. How causal analysis can reveal autonomy in models of biological systems.
Philos Trans A Math Phys Eng Sci. 2017 Dec 28;375(2109). doi: 10.1098/rsta.2016.0358.
8. Unified framework for information integration based on information geometry.
Proc Natl Acad Sci U S A. 2016 Dec 20;113(51):14817-14822. doi: 10.1073/pnas.1603583113. Epub 2016 Dec 6.
9. Improved Measures of Integrated Information.
PLoS Comput Biol. 2016 Nov 21;12(11):e1005123. doi: 10.1371/journal.pcbi.1005123. eCollection 2016 Nov.
10. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning.
IEEE Trans Med Imaging. 2016 May;35(5):1285-98. doi: 10.1109/TMI.2016.2528162. Epub 2016 Feb 11.