
Deep neural networks have an inbuilt Occam's razor.

Author Information

Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis

Affiliations

Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK.

Physical and Theoretical Chemistry Laboratory, University of Oxford, Oxford, UK.

Publication Information

Nat Commun. 2025 Jan 14;16(1):220. doi: 10.1038/s41467-024-54813-x.

Abstract

The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components for supervised learning, we apply a Bayesian picture based on the functions expressed by a DNN. The prior over functions is determined by the network architecture, which we vary by exploiting a transition between ordered and chaotic regimes. For Boolean function classification, we approximate the likelihood using the error spectrum of functions on data. Combining this with the prior yields an accurate prediction for the posterior, measured for DNNs trained with stochastic gradient descent. This analysis shows that structured data, together with a specific Occam's razor-like inductive bias towards (Kolmogorov) simple functions that exactly counteracts the exponential growth of the number of functions with complexity, is a key to the success of DNNs.
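The abstract describes a pipeline of the form P(f|D) ∝ P(D|f) · P(f): a prior over functions induced by the network architecture, combined with a likelihood based on each function's errors on the data. The sketch below is a minimal, hypothetical illustration of that picture, not the authors' code: it estimates the prior by sampling randomly initialised tanh networks on all inputs of a 3-bit Boolean problem, uses a 0/1 indicator of consistency with the training labels as a crude stand-in for the error-spectrum likelihood, and normalises the product to obtain a posterior. The width, weight scale sigma_w, sample count, training subset, and target function are all arbitrary choices for illustration.

```python
import itertools
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)
n_inputs = 3
# All 2^n Boolean inputs, encoded as +/-1 vectors.
X = np.array(list(itertools.product([-1.0, 1.0], repeat=n_inputs)))

def sample_function(width=64, sigma_w=2.0):
    """Sample one randomly initialised tanh network and return the Boolean
    function it expresses, as a tuple of outputs over all 2^n inputs.
    (sigma_w loosely plays the role of the order/chaos control parameter.)"""
    W1 = rng.normal(0.0, sigma_w / np.sqrt(n_inputs), (n_inputs, width))
    b1 = rng.normal(0.0, sigma_w, width)
    W2 = rng.normal(0.0, sigma_w / np.sqrt(width), (width, 1))
    out = np.tanh(X @ W1 + b1) @ W2
    return tuple(bool(v > 0) for v in out.ravel())

# Prior P(f): relative frequency with which each function is expressed
# by randomly initialised networks.
prior = Counter(sample_function() for _ in range(20_000))
total = sum(prior.values())

# Training data D: labels of a simple target (copy the first input bit)
# on 5 of the 8 possible inputs.
target = tuple(bool(v > 0) for v in X[:, 0])
train_idx = [0, 1, 2, 3, 4]

# 0/1 likelihood: P(D | f) = 1 iff f agrees with the target on the
# training set; everything else gets zero posterior mass.
consistent = {f: c for f, c in prior.items()
              if all(f[i] == target[i] for i in train_idx)}
Z = sum(consistent.values())
posterior = {f: c / Z for f, c in consistent.items()}

# Posterior mass concentrates on a few functions: the architecture's
# simplicity bias surviving conditioning on the data.
for f, p in sorted(posterior.items(), key=lambda kv: -kv[1])[:5]:
    print("".join(str(int(b)) for b in f), f"P(f|D) = {p:.3f}")
```

Increasing sigma_w pushes the random networks toward the chaotic regime, spreading the estimated prior over more (and more complex) functions; sweeping it is one simple way to probe, in the spirit of the abstract, how the ordered/chaotic transition reshapes the prior and hence the posterior.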

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1710/11733143/5d5c8e02de27/41467_2024_54813_Fig1_HTML.jpg
