Deep Cascade Learning.

Author Information

Marquez Enrique S, Hare Jonathon S, Niranjan Mahesan

Publication Information

IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5475-5485. doi: 10.1109/TNNLS.2018.2805098. Epub 2018 Mar 6.

Abstract

In this paper, we propose a novel approach for efficient training of deep neural networks in a bottom-up fashion using a layered structure. Our algorithm, which we refer to as deep cascade learning, is motivated by the cascade correlation approach of Fahlman and Lebiere, who introduced it in the context of perceptrons. We demonstrate our algorithm on networks of convolutional layers, though its applicability is more general. Training deep networks in such a cascade directly circumvents the well-known vanishing gradient problem by ensuring that the output is always adjacent to the layer being trained. We present empirical evaluations comparing our deep cascade training with standard end-end backpropagation training of two convolutional neural network architectures on benchmark image classification tasks (CIFAR-10 and CIFAR-100). We then investigate the features learned by the approach and find that better, domain-specific representations are learned in early layers than in end-end training. This is partially attributable to the vanishing gradient problem, which inhibits early-layer filters from changing significantly from their initial settings. While both networks perform similarly overall, under cascade training recognition accuracy increases progressively with each added layer, with discriminative features learned at every stage of the network, whereas no such systematic feature representation was observed in end-end training. We also show that cascade training has significant computational and memory advantages over end-end training, and can be used as a pretraining algorithm to obtain better performance.
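To make the bottom-up procedure described in the abstract concrete, the sketch below shows one way to implement layer-wise cascade training in PyTorch: each stage trains only the newest convolutional block plus a temporary output head, with all earlier blocks frozen, so the loss is always adjacent to the layer being trained. This is a minimal sketch under stated assumptions; the toy two-block architecture, the pooled-linear auxiliary head, and the hyperparameters are illustrative choices, not the authors' exact configuration or output-module design from the paper.

```python
# A minimal, self-contained sketch of layer-wise cascade training, assuming PyTorch.
# The toy conv blocks, the pooled-linear output head, and the hyperparameters are
# illustrative assumptions, not the exact architecture or output modules from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


def cascade_train(blocks, loader, num_classes, epochs_per_stage=2, device="cpu"):
    """Train `blocks` one stage at a time, bottom-up.

    At every stage, only the newest block and a temporary output head receive
    gradients; earlier blocks are frozen and act as a fixed feature extractor,
    so the loss is always adjacent to the layer being trained.
    """
    trained = []  # frozen, already-trained stages

    def lower_features(x):
        # Pass data through the frozen stages without tracking gradients.
        with torch.no_grad():
            for b in trained:
                x = b(x)
        return x

    for stage, block in enumerate(blocks):
        block = block.to(device)

        # Infer the head's input width from one batch, then build a temporary
        # classifier head: global average pool -> flatten -> linear.
        x0, _ = next(iter(loader))
        with torch.no_grad():
            channels = block(lower_features(x0.to(device))).shape[1]
        head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                             nn.Linear(channels, num_classes)).to(device)

        opt = torch.optim.Adam(list(block.parameters()) + list(head.parameters()),
                               lr=1e-3)
        for _ in range(epochs_per_stage):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                loss = F.cross_entropy(head(block(lower_features(x))), y)
                opt.zero_grad()
                loss.backward()  # the gradient path is only one stage deep
                opt.step()

        for p in block.parameters():  # freeze the stage that was just trained
            p.requires_grad_(False)
        trained.append(block)
        print(f"stage {stage}: final batch loss {loss.item():.3f}")

    return nn.Sequential(*trained)  # the stacked, cascade-trained feature extractor


if __name__ == "__main__":
    # Toy run on random CIFAR-10-shaped data; substitute a real data loader.
    data = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
    loader = DataLoader(data, batch_size=64, shuffle=True)
    blocks = [
        nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
        nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
    ]
    cascade_train(blocks, loader, num_classes=10, epochs_per_stage=1)
```

Because each stage's gradient path is only one block deep, vanishing gradients cannot wash out early-layer updates; and since frozen blocks never change once a stage finishes, their activations could in principle be precomputed and cached, which is consistent with the computational and memory savings the abstract reports.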
