
Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning.

Author Information

Lu Chenguang

Affiliations

Intelligence Engineering and Mathematics Institute, Liaoning Technical University, Fuxin 123000, China.

School of Computer Engineering and Applied Mathematics, Changsha University, Changsha 410022, China.

Publication Information

Entropy (Basel). 2023 May 15;25(5):802. doi: 10.3390/e25050802.

DOI: 10.3390/e25050802
PMID: 37238557
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10217299/
Abstract

A new trend in deep learning, represented by Mutual Information Neural Estimation (MINE) and Information Noise Contrast Estimation (InfoNCE), is emerging. In this trend, similarity functions and Estimated Mutual Information (EMI) are used as learning and objective functions. Coincidentally, EMI is essentially the same as Semantic Mutual Information (SeMI) proposed by the author 30 years ago. This paper first reviews the evolutionary histories of semantic information measures and learning functions. Then, it briefly introduces the author's semantic information G theory with the rate-fidelity function R(G) (G denotes SeMI, and R(G) extends R(D)) and its applications to multi-label learning, the maximum Mutual Information (MI) classification, and mixture models. Then it discusses how we should understand the relationship between SeMI and Shannon's MI, two generalized entropies (fuzzy entropy and coverage entropy), Autoencoders, Gibbs distributions, and partition functions from the perspective of the R(G) function or the G theory. An important conclusion is that mixture models and Restricted Boltzmann Machines converge because SeMI is maximized, and Shannon's MI is minimized, making information efficiency G/R close to 1. A potential opportunity is to simplify deep learning by using Gaussian channel mixture models for pre-training deep neural networks' latent layers without considering gradients. It also discusses how the SeMI measure is used as the reward function (reflecting purposiveness) for reinforcement learning. The G theory helps interpret deep learning but is far from enough. Combining semantic information theory and deep learning will accelerate their development.
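For orientation, the following is a hedged sketch of the quantities the abstract names (truth function, semantic information, SeMI, and the rate-fidelity function R(G)), reconstructed from the author's earlier G-theory publications rather than quoted from this paper; the notation is an assumption and may differ in detail from the paper's own.

```latex
% Hedged reconstruction (not quoted from this paper).
% T(\theta_j \mid x) is the truth (membership) function of the fuzzy set
% denoted by label y_j; T(\theta_j) is its logical probability.
\begin{align*}
  T(\theta_j) &= \textstyle\sum_i P(x_i)\,T(\theta_j \mid x_i)
      && \text{(logical probability)}\\
  I(x_i;\theta_j) &= \log\frac{T(\theta_j \mid x_i)}{T(\theta_j)}
      && \text{(semantic information)}\\
  G &= \textstyle\sum_j\sum_i P(x_i)\,P(y_j \mid x_i)\,I(x_i;\theta_j)
      && \text{(SeMI)}\\
  R(G) &= \min_{P(y \mid x)\;:\;\mathrm{SeMI}\,\ge\,G} I(X;Y)
      && \text{(rate-fidelity; extends } R(D)\text{)}
\end{align*}
```

On this reading, information efficiency is the ratio G/R, and the abstract's convergence claim is that training raises G while lowering R until G/R approaches 1.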

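To make "similarity functions and Estimated Mutual Information (EMI) as learning and objective functions" concrete, below is a minimal NumPy sketch of the standard InfoNCE objective with cosine similarity. It is illustrative only: the function name, the cosine-similarity choice, and the default temperature are assumptions rather than details from the paper; log N minus the loss is the usual EMI lower bound associated with InfoNCE.

```python
import numpy as np

def info_nce_loss(z_x, z_y, temperature=0.1):
    """InfoNCE over a batch of paired embeddings (illustrative sketch).

    z_x, z_y: (N, d) arrays; row i of z_x is paired with row i of z_y,
    while the other N - 1 rows of z_y serve as negative samples.
    Returns (loss, emi_bound), where emi_bound = log N - loss is the
    usual InfoNCE lower bound on mutual information.
    """
    # Cosine similarity plays the role of the learning (similarity) function.
    z_x = z_x / np.linalg.norm(z_x, axis=1, keepdims=True)
    z_y = z_y / np.linalg.norm(z_y, axis=1, keepdims=True)
    logits = (z_x @ z_y.T) / temperature          # (N, N); diagonal = positives

    # Row-wise log-softmax, stabilized by subtracting each row's max.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    loss = -np.mean(np.diag(log_prob))            # cross-entropy on positives
    return loss, np.log(z_x.shape[0]) - loss

# Toy usage: two correlated "views" of the same data.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 32))
loss, emi_bound = info_nce_loss(x, x + 0.1 * rng.normal(size=(128, 32)))
```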

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/02511dc9d521/entropy-25-00802-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/d764c98adb79/entropy-25-00802-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/7e483b9dfbf2/entropy-25-00802-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/f9f92dfab1df/entropy-25-00802-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/d90e555ca6fa/entropy-25-00802-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/113ad7222e41/entropy-25-00802-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/67dfd7ac49db/entropy-25-00802-g007a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/91064bd26067/entropy-25-00802-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fe/10217299/079001f04468/entropy-25-00802-g009.jpg

Similar Articles

1. Reviewing Evolution of Learning Functions and Semantic Information Measures for Understanding Deep Learning.
Entropy (Basel). 2023 May 15;25(5):802. doi: 10.3390/e25050802.
2. Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions.
Entropy (Basel). 2021 Aug 15;23(8):1050. doi: 10.3390/e23081050.
3. Semantic and Generalized Entropy Loss Functions for Semi-Supervised Deep Learning.
Entropy (Basel). 2020 Mar 14;22(3):334. doi: 10.3390/e22030334.
4. Asymptotic Normality for Plug-In Estimators of Generalized Shannon's Entropy.
Entropy (Basel). 2022 May 12;24(5):683. doi: 10.3390/e24050683.
5. Measuring the usefulness of hidden units in Boltzmann machines with mutual information.
Neural Netw. 2015 Apr;64:12-8. doi: 10.1016/j.neunet.2014.09.004. Epub 2014 Sep 28.
6. Information Flows of Diverse Autoencoders.
Entropy (Basel). 2021 Jul 5;23(7):862. doi: 10.3390/e23070862.
7. Enhanced mutual information neural estimators for optical fiber communication.
Opt Lett. 2024 Aug 1;49(15):4381-4384. doi: 10.1364/OL.534025.
8. Shannon's, mutual, conditional and joint entropy information indices: generalization of global indices defined from local vertex invariants.
Curr Comput Aided Drug Des. 2013 Jun;9(2):164-83. doi: 10.2174/1573409911309020003.
9. Generalizing Information to the Evolution of Rational Belief.
Entropy (Basel). 2020 Jan 16;22(1):108. doi: 10.3390/e22010108.
10. Active semi-supervised learning method with hybrid deep belief networks.
PLoS One. 2014 Sep 10;9(9):e107122. doi: 10.1371/journal.pone.0107122. eCollection 2014.

Cited By

1. A Semantic Generalization of Shannon's Information Theory and Applications.
Entropy (Basel). 2025 Apr 24;27(5):461. doi: 10.3390/e27050461.
2. (HTBNet) Arbitrary Shape Scene Text Detection with Binarization of Hyperbolic Tangent and Cross-Entropy.
Entropy (Basel). 2024 Jun 29;26(7):560. doi: 10.3390/e26070560.

References

1. Causal Confirmation Measures: From Simpson's Paradox to COVID-19.
Entropy (Basel). 2023 Jan 10;25(1):143. doi: 10.3390/e25010143.
2. An Information Theoretic Interpretation to Deep Neural Networks.
Entropy (Basel). 2022 Jan 17;24(1):135. doi: 10.3390/e24010135.
3. Kernel Estimation of Cumulative Residual Tsallis Entropy and Its Dynamic Version under -Mixing Dependent Data.
Entropy (Basel). 2021 Dec 21;24(1):9. doi: 10.3390/e24010009.
4. Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions.
Entropy (Basel). 2021 Aug 15;23(8):1050. doi: 10.3390/e23081050.
5. Channels' Confirmation and Predictions' Confirmation: From the Medical Test to the Raven Paradox.
Entropy (Basel). 2020 Mar 26;22(4):384. doi: 10.3390/e22040384.
6. From evidence to understanding: a commentary on Fisher (1922) 'On the mathematical foundations of theoretical statistics'.
Philos Trans A Math Phys Eng Sci. 2015 Apr 13;373(2039). doi: 10.1098/rsta.2014.0252.
7. Reducing the dimensionality of data with neural networks.
Science. 2006 Jul 28;313(5786):504-7. doi: 10.1126/science.1127647.
8. A fast learning algorithm for deep belief nets.
Neural Comput. 2006 Jul;18(7):1527-54. doi: 10.1162/neco.2006.18.7.1527.
9. Stimulus and response generalization: tests of a model relating generalization to distance in psychological space.
J Exp Psychol. 1958 Jun;55(6):509-23. doi: 10.1037/h0042354.