School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
College of Information Science and Engineering, Henan University of Technology, Zhengzhou, 450001, China.
Neural Netw. 2019 Nov;119:286-298. doi: 10.1016/j.neunet.2019.08.015. Epub 2019 Aug 27.
Deep Neural Networks (DNNs) have achieved extraordinary success in numerous areas. However, DNNs often carry a large number of weight parameters, which incurs heavy memory and computation costs. Overfitting is another challenge when the training data are insufficient. These challenges severely hinder the deployment of DNNs on resource-constrained platforms. In fact, many network weights are redundant and can be removed without much loss of performance. In this paper, we introduce a new non-convex integrated transformed ℓ1 regularizer that promotes sparsity in DNNs by removing redundant connections and unnecessary neurons simultaneously. Specifically, we apply the transformed ℓ1 regularizer to the matrix space of network weights to remove redundant connections, and integrate a group sparsity term to remove unnecessary neurons. An efficient stochastic proximal gradient algorithm is presented to solve the new model. To the best of our knowledge, this is the first work to develop a non-convex regularizer within a sparse-optimization-based method that simultaneously promotes connection-level and neuron-level sparsity for DNNs. Experiments on public datasets demonstrate the effectiveness of the proposed method.
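For reference, the transformed ℓ1 (TL1) penalty used in this line of work is typically defined elementwise as ρ_a(w) = (a + 1)|w| / (a + |w|) with a shape parameter a > 0, interpolating between an ℓ0-like count (a → 0) and the ℓ1 norm (a → ∞), while the group-sparsity term sums the Euclidean norms of weight groups such as the rows tied to individual neurons. The sketch below is a minimal PyTorch illustration of how such a combined penalty could be evaluated over a network's weight matrices; the grouping by rows, the coefficients lambda1 and lambda2, and the function names are illustrative assumptions rather than the paper's exact formulation, and the paper itself optimizes the resulting objective with a stochastic proximal gradient algorithm rather than by simply adding the penalty to the training loss.

import torch

def transformed_l1(W, a=1.0):
    # Elementwise transformed-L1 penalty rho_a(w) = (a + 1)|w| / (a + |w|).
    # Behaves like an L0 count as a -> 0 and like the L1 norm as a -> infinity.
    absW = W.abs()
    return ((a + 1.0) * absW / (a + absW)).sum()

def group_sparsity(W):
    # Group-lasso term over the rows of W; driving an entire row to zero removes
    # the corresponding output neuron (this grouping choice is an assumption).
    return W.norm(p=2, dim=1).sum()

def combined_penalty(weight_matrices, lambda1=1e-4, lambda2=1e-4, a=1.0):
    # Integrated regularizer: connection-level TL1 plus neuron-level group sparsity,
    # summed over all weight matrices of the network.
    return sum(lambda1 * transformed_l1(W, a) + lambda2 * group_sparsity(W)
               for W in weight_matrices)

In practice this value would be added to the data loss, with the non-smooth part handled by a proximal (shrinkage) step during stochastic optimization, as described in the paper.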