
Uncertainty propagation for dropout-based Bayesian neural networks.

Affiliations

DENSO CORPORATION, 1-1, Showa-cho, Kariya, Aichi, 448-8661, Japan.

Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8654, Japan; Center for Advanced Intelligence Project, RIKEN, Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan.

Publication Information

Neural Netw. 2021 Dec;144:394-406. doi: 10.1016/j.neunet.2021.09.005. Epub 2021 Sep 9.

Abstract

Uncertainty evaluation is a core technique when deep neural networks (DNNs) are used in real-world problems. In practical applications, we often encounter unexpected samples that have not been seen during training. For safety-critical systems, it is important not only to achieve high prediction accuracy but also to detect uncertain data. In statistics and machine learning, Bayesian inference has been exploited for uncertainty evaluation. Bayesian neural networks (BNNs) have recently attracted considerable attention in this context, as a DNN trained using dropout can be interpreted as a Bayesian method. Based on this interpretation, several methods for calculating the Bayes predictive distribution of DNNs have been developed. Although the Monte Carlo method called MC dropout is a popular approach to uncertainty evaluation, it requires many repeated feed-forward calculations of the DNN with randomly sampled weight parameters. To overcome this computational issue, we propose a sampling-free method to evaluate uncertainty. Our method converts a neural network trained using dropout into the corresponding Bayesian neural network with variance propagation. It is applicable not only to feed-forward NNs but also to recurrent NNs such as LSTMs. We report the computational efficiency and statistical reliability of our method in numerical experiments on language modeling with RNNs and on out-of-distribution detection with DNNs.
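
To make the contrast concrete, the sketch below is a simplified illustration, not the authors' exact algorithm. It assumes Bernoulli dropout with keep probability `keep_prob` applied to the inputs of a single linear layer, with independent mask elements, and compares MC dropout (averaging many stochastic forward passes) against a sampling-free pass that propagates the mean and variance analytically.

```python
# Minimal sketch: MC dropout vs. analytic variance propagation through one
# linear layer. Assumes inference-time Bernoulli dropout with probability
# `keep_prob` of keeping each input unit, no rescaling, independent masks.
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_linear(x, W, b, keep_prob=0.9, n_samples=1000):
    """MC dropout: repeat the stochastic forward pass with fresh Bernoulli
    masks and summarize the outputs by their sample mean and variance."""
    outs = []
    for _ in range(n_samples):
        mask = rng.binomial(1, keep_prob, size=x.shape)  # sampled dropout mask
        outs.append(W @ (mask * x) + b)
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.var(axis=0)

def variance_propagation_linear(x, W, b, keep_prob=0.9):
    """Sampling-free alternative: propagate the first two moments analytically.
    For z ~ Bernoulli(p): E[z*x] = p*x and Var[z*x] = p*(1-p)*x**2; a linear
    map then gives mean W @ m + b and variance (W**2) @ v."""
    m = keep_prob * x
    v = keep_prob * (1.0 - keep_prob) * x ** 2
    return W @ m + b, (W ** 2) @ v

# Toy comparison: the analytic moments should match the MC estimates,
# while costing a single forward pass instead of n_samples passes.
x = rng.normal(size=8)
W = rng.normal(size=(3, 8))
b = rng.normal(size=3)
mc_mean, mc_var = mc_dropout_linear(x, W, b, n_samples=20000)
an_mean, an_var = variance_propagation_linear(x, W, b)
print("MC mean      :", np.round(mc_mean, 3))
print("analytic mean:", np.round(an_mean, 3))
print("MC var       :", np.round(mc_var, 3))
print("analytic var :", np.round(an_var, 3))
```

With enough samples the MC estimates converge to the analytic moments, but the analytic pass needs only one forward computation; the paper develops this kind of variance propagation for full networks, including recurrent architectures such as LSTMs.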

