Department of Mathematics, University of Bergen, Norway.
Neural Netw. 2022 Jan;145:164-176. doi: 10.1016/j.neunet.2021.10.014. Epub 2021 Oct 23.
The Delta method is a classical procedure for quantifying epistemic uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters P. We propose a low-cost approximation of the Delta method applicable to L2-regularized deep neural networks, based on the top K eigenpairs of the Fisher information matrix. We address efficient computation of full-rank approximate eigendecompositions in terms of the exact inverse Hessian, the inverse outer-products-of-gradients approximation, and the so-called Sandwich estimator. Moreover, we provide bounds on the approximation error for the uncertainty of the predictive class probabilities. We show that when the smallest computed eigenvalue of the Fisher information matrix is near the L2-regularization rate, the approximation error is close to zero even when K ≪ P. A demonstration of the methodology is presented using a TensorFlow implementation, and we show that meaningful rankings of images based on predictive uncertainty can be obtained for two LeNet- and ResNet-based neural networks using the MNIST and CIFAR-10 datasets. Further, we observe that false positives have, on average, higher predictive epistemic uncertainty than true positives. This suggests that the uncertainty measure carries supplementary information not captured by the classification alone.
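The core idea described above can be sketched numerically. In the Delta method, the predictive variance of an output f(x) is g(x)ᵀ F⁻¹ g(x), where g(x) is the gradient of f with respect to the parameters and F is the Fisher information matrix. With only the top K eigenpairs of F computed, and under the abstract's observation that the remaining eigenvalues are close to the L2-regularization rate λ, F⁻¹ can be approximated by a rank-K correction plus an isotropic term. The function below is a minimal NumPy sketch of that idea, not the paper's TensorFlow implementation; names and the exact formula layout are my own.

```python
import numpy as np

def delta_method_variance(grad, eigvals, eigvecs, lam_reg):
    """Approximate the Delta-method variance g^T F^{-1} g.

    grad     : (P,) gradient of the network output w.r.t. parameters
    eigvals  : (K,) top-K eigenvalues of the Fisher information matrix
    eigvecs  : (P, K) corresponding eigenvectors
    lam_reg  : scalar, the L2-regularization rate, assumed to equal
               the remaining P-K eigenvalues (the paper's regime)

    F^{-1} is approximated by
        V diag(1/eigvals - 1/lam_reg) V^T + (1/lam_reg) I,
    i.e. a rank-K correction on top of an isotropic term.
    """
    proj = eigvecs.T @ grad                       # (K,) projections onto top eigvecs
    top = np.sum(proj**2 * (1.0 / eigvals - 1.0 / lam_reg))
    rest = (grad @ grad) / lam_reg                # isotropic remainder
    return top + rest
```

When the uncomputed eigenvalues exactly equal lam_reg, this truncated expression coincides with the full-rank quadratic form, which is the sense in which the abstract's approximation error vanishes even for K ≪ P.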