Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, United Kingdom.
Department of Mathematics, University of Oslo, 0316 Oslo, Norway.
Proc Natl Acad Sci U S A. 2022 Mar 22;119(12):e2107151119. doi: 10.1073/pnas.2107151119. Epub 2022 Mar 16.
Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities; however, there does not exist any algorithm, even randomized, that can train (or compute) such a NN. For any positive integers K > 2 and L, there are cases where simultaneously 1) no randomized training algorithm can compute a NN correct to K digits with probability greater than 1/2; 2) there exists a deterministic training algorithm that computes a NN with K − 1 correct digits, but any such (even randomized) algorithm needs arbitrarily many training data; and 3) there exists a deterministic training algorithm that computes a NN with K − 2 correct digits using no more than L training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce fast iterative restarted networks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only O(|log(ϵ)|) layers are needed for an ϵ-accurate solution to the inverse problem.
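The O(|log(ϵ)|) layer count is what one expects from a linearly convergent, restarted iterative scheme unrolled into network layers. As an illustrative sketch only (the contraction factor ν and the per-block error reduction are expository assumptions, not quantities taken from the abstract): if each unrolled block of such a network reduces the reconstruction error by a fixed factor ν ∈ (0, 1), then after n blocks

\[ \|x_n - x^{*}\| \;\le\; \nu\,\|x_{n-1} - x^{*}\| \;\le\; \cdots \;\le\; \nu^{n}\,\|x_0 - x^{*}\|, \]

so reaching accuracy ϵ requires only

\[ n \;\ge\; \frac{\log\!\big(\|x_0 - x^{*}\|/\epsilon\big)}{\log(1/\nu)} \;=\; O\!\big(|\log(\epsilon)|\big) \]

layers; that is, under this assumption the depth grows logarithmically, not polynomially, in the target accuracy.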