Maji Partha, Mullins Robert
Department of Computer Science and Technology, University of Cambridge, William Gates Building, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK.
Entropy (Basel). 2018 Apr 23;20(4):305. doi: 10.3390/e20040305.
Deep convolutional neural networks (ConvNets), which are at the heart of many new emerging applications, achieve remarkable performance in audio and visual recognition tasks. Unfortunately, this accuracy often comes at a significant computational cost, which limits deployability. In modern ConvNets, the convolution layers typically consume the vast majority of computational resources during inference, making the acceleration of these layers an important research area in both academia and industry. In this paper, we examine the effects of co-optimizing the internal structures of the convolutional layers and the underlying implementation of the fundamental convolution operation. We demonstrate that a combination of these methods can have a large impact on the overall speed of a ConvNet, achieving a ten-fold speedup over the baseline. We also introduce a new class of fast one-dimensional (1D) convolutions for ConvNets using the Toom-Cook algorithm. We show that our proposed scheme is mathematically well-grounded, robust, and does not require any time-consuming retraining, while still achieving speedups solely from the convolutional layers with no loss in baseline accuracy.
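To illustrate the Toom-Cook family of fast convolutions referenced in the abstract, the sketch below implements the standard F(2,3) minimal-filtering variant in NumPy: two outputs of a 3-tap 1D filter are computed with four multiplications instead of six. The transform matrices BT, G, and AT are the widely published Toom-Cook/Winograd F(2,3) matrices, not necessarily the exact variant derived in the paper, and the function name f23 is illustrative.

import numpy as np

# Toom-Cook / Winograd F(2,3): two outputs of a 3-tap 1D convolution
# using 4 multiplications instead of 6. Standard published matrices;
# the paper's own derivation may differ in nesting and scaling.

BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float64)  # input transform

G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                    # filter transform

AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float64)   # output transform

def f23(d, g):
    """Two outputs of sliding a 3-tap filter g over a 4-sample tile d."""
    U = G @ g    # filter transform (precomputable once per filter)
    V = BT @ d   # input transform
    M = U * V    # 4 elementwise multiplications (the expensive part)
    return AT @ M

# Check against the direct sliding-window computation.
d = np.random.randn(4)
g = np.random.randn(3)
direct = np.array([d[0:3] @ g, d[1:4] @ g])
assert np.allclose(f23(d, g), direct)

For longer signals, the input is processed in 4-sample tiles overlapping by 2 samples, and the filter transform G @ g is computed once and reused across all tiles, which is where the arithmetic savings compound.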