Jawla Devesh, Kelleher John
School of Computer Science, Technological University Dublin, Dublin, Ireland.
ADAPT Research Centre, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland.
Front Artif Intell. 2025 Apr 25;8:1444891. doi: 10.3389/frai.2025.1444891. eCollection 2025.
Previous work has demonstrated that initialization is very important both for fitting a neural network by gradient-descent methods and for variational inference in Bayesian neural networks. In this work we investigate how different priors perform with Bayesian neural networks trained by Markov Chain Monte Carlo. Our experiments on eight classification datasets of varying complexity indicate that using a layer-wise Scaled Prior makes the sampling process more efficient compared to using an Isotropic Gaussian Prior, an Isotropic Cauchy Prior, or an Isotropic Laplace Prior. We also show that the cold posterior effect does not arise when using either an Isotropic Gaussian or a layer-wise Scaled Prior for small feed-forward Bayesian neural networks. Since Bayesian neural networks are becoming popular due to advantages such as uncertainty estimation and prevention of over-fitting, this work seeks to improve the efficiency of Bayesian neural networks learned using Markov Chain Monte Carlo methods.
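To make the contrast between the prior families concrete, the sketch below compares the log density (up to additive constants) of an isotropic Gaussian prior, where every weight shares one scale, against a layer-wise scaled Gaussian prior, where each layer's scale shrinks with its fan-in. The 1/sqrt(fan_in) scaling and the layer shapes are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feed-forward layer shapes (fan_in, fan_out); chosen for
# illustration only, not taken from the paper's experiments.
layer_shapes = [(8, 16), (16, 16), (16, 2)]
weights = [rng.normal(size=shape) for shape in layer_shapes]

def log_isotropic_gaussian_prior(weights, sigma=1.0):
    """Log density (up to constants) of an isotropic Gaussian prior:
    every weight in every layer shares the same scale sigma."""
    return sum(-0.5 * np.sum((w / sigma) ** 2) for w in weights)

def log_layerwise_scaled_prior(weights):
    """Log density (up to constants) of a layer-wise scaled Gaussian
    prior: each layer's weights get scale 1/sqrt(fan_in), one common
    scaling rule, assumed here for illustration."""
    total = 0.0
    for w in weights:
        fan_in = w.shape[0]          # rows index the layer's inputs
        sigma = 1.0 / np.sqrt(fan_in)
        total += -0.5 * np.sum((w / sigma) ** 2)
    return total

print(log_isotropic_gaussian_prior(weights))
print(log_layerwise_scaled_prior(weights))
```

Because the layer-wise scale shrinks with fan-in, the same weight vector is penalized more heavily under the scaled prior; in an MCMC sampler this reshapes the posterior geometry layer by layer, which is the kind of effect on sampling efficiency the abstract describes.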