Improving generalization performance using double backpropagation.

Author Information

Drucker H, Le Cun Y

Affiliation

AT&T Bell Laboratories, West Long Branch, NJ.

Publication Information

IEEE Trans Neural Netw. 1992;3(6):991-7. doi: 10.1109/72.165600.

Abstract

To generalize from a training set to a test set, it is desirable that small changes in the input space of a pattern do not change the output components. This behavior can be enforced as part of the training algorithm. Double backpropagation does so by forming an energy function that is the sum of the normal energy term found in backpropagation and an additional term that is a function of the Jacobian. Significant improvement is shown with different architectures and different test sets, especially with architectures that had previously been shown to perform very well when trained with backpropagation. Compared to backpropagation, double backpropagation produces smaller weights, causing the outputs of the neurons to spend more time in the linear region.
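
The energy function described in the abstract lends itself to a short sketch. Below is a minimal, hypothetical JAX rendering of the idea: the usual squared-error energy plus a penalty on the gradient of that energy with respect to the input. The two-layer tanh network, the penalty weight `lam`, and all names are illustrative assumptions, not details taken from the paper.

```python
import jax
import jax.numpy as jnp

def net(params, x):
    # Hypothetical two-layer tanh network (not an architecture from the paper).
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return h @ w2 + b2

def standard_energy(params, x, y):
    # The normal backpropagation energy term: squared output error.
    return 0.5 * jnp.sum((net(params, x) - y) ** 2)

def double_backprop_energy(params, x, y, lam=0.01):
    # Double backpropagation adds a penalty on the gradient of the
    # standard energy with respect to the *input*, so that small input
    # perturbations cannot change the output much.
    e = standard_energy(params, x, y)
    dE_dx = jax.grad(standard_energy, argnums=1)(params, x, y)
    return e + lam * jnp.sum(dE_dx ** 2)

# Training differentiates the combined energy with respect to the weights;
# backpropagating through the input-gradient term is the second backward
# pass that gives the method its name.
grad_fn = jax.jit(jax.grad(double_backprop_energy))
```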
