Noise-enhanced convolutional neural networks.

Author information

Signal and Information Processing Institute, Electrical Engineering Department, University of Southern California, Los Angeles, CA, United States.

Publication information

Neural Netw. 2016 Jun;78:15-23. doi: 10.1016/j.neunet.2015.09.014. Epub 2015 Oct 19.

Abstract

Injecting carefully chosen noise can speed convergence in the backpropagation training of a convolutional neural network (CNN). The Noisy CNN algorithm speeds training on average because the backpropagation algorithm is a special case of the generalized expectation-maximization (EM) algorithm and because such carefully chosen noise always speeds up the EM algorithm on average. The CNN framework gives a practical way to learn and recognize images because backpropagation scales with training data. It has only linear time complexity in the number of training samples. The Noisy CNN algorithm finds a special separating hyperplane in the network's noise space. The hyperplane arises from the likelihood-based positivity condition that noise-boosts the EM algorithm. The hyperplane cuts through a uniform-noise hypercube or Gaussian ball in the noise space depending on the type of noise used. Noise chosen from above the hyperplane speeds training on average. Noise chosen from below slows it on average. The algorithm can inject noise anywhere in the multilayered network. Adding noise to the output neurons reduced the average per-iteration training-set cross entropy by 39% on a standard MNIST image test set of handwritten digits. It also reduced the average per-iteration training-set classification error by 47%. Adding noise to the hidden layers can also reduce these performance measures. The noise benefit is most pronounced for smaller data sets because the largest EM hill-climbing gains tend to occur in the first few iterations. This noise effect can assist random sampling from large data sets because it allows a smaller random sample to give the same or better performance than a noiseless sample gives.
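The separating-hyperplane condition lends itself to a short sketch. The Python fragment below is a minimal illustration, not the paper's implementation: it assumes a softmax output layer trained with cross-entropy, for which the likelihood-based positivity condition can be read as n · log(a) ≥ 0, with a the output activation vector, and it samples uniform noise only from the beneficial side of that hyperplane before perturbing the one-hot training target. The helper name nem_noise, the annealing schedule, and the single-sample update are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax output activation."""
    e = np.exp(z - z.max())
    return e / e.sum()

def nem_noise(a, scale, rng, max_tries=100):
    """Sample uniform noise from the side of the assumed NEM
    hyperplane that speeds training on average: n . log(a) >= 0.
    Falls back to the zero vector if no beneficial sample is found."""
    log_a = np.log(a + 1e-12)
    for _ in range(max_tries):
        n = rng.uniform(-scale, scale, size=a.shape)
        if n @ log_a >= 0.0:          # above the separating hyperplane
            return n
    return np.zeros_like(a)

# Toy training step for a softmax output layer: the cross-entropy
# gradient w.r.t. the pre-activation z is (a - t), and the noisy
# variant perturbs the target t with beneficial noise first.
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, size=(10, 784))   # output-layer weights
x = rng.random(784)                       # one flattened 28x28 image
t = np.zeros(10); t[3] = 1.0              # one-hot label

lr = 0.1
for it in range(1, 101):
    a = softmax(W @ x)
    scale = 0.5 / it                      # anneal the noise over iterations
    n = nem_noise(a, scale, rng)
    grad_z = a - (t + n)                  # noisy-target cross-entropy gradient
    W -= lr * np.outer(grad_z, x)
```

Annealing the noise scale reflects the abstract's observation that the largest EM hill-climbing gains tend to occur in the first few iterations; as the scale shrinks, the update reduces to ordinary backpropagation.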

