Shan Dongjing, Luo Yong, Zhang Xiongwei, Zhang Chao
IEEE Trans Neural Netw Learn Syst. 2023 Apr;34(4):2057-2067. doi: 10.1109/TNNLS.2021.3105818. Epub 2023 Apr 4.
Recurrent neural networks (RNNs) continue to show outstanding performance in sequence learning tasks such as language modeling, but training RNNs on long sequences remains difficult. The main challenges lie in capturing complex dependencies, vanishing or exploding gradients, and the demand for low resource consumption in model deployment. To address these challenges, we propose dynamic recurrent routing neural networks (DRRNets), which can: 1) shorten recurrent lengths by dynamically allocating recurrent routes for different dependencies and 2) significantly reduce the number of parameters by imposing low-rank constraints on the fully connected layers. A novel optimization algorithm based on low-rank constraints and sparsity projection is developed to train the network. We verify the effectiveness of the proposed method by comparing it with multiple competitive approaches on several popular sequence learning tasks, such as language modeling and speaker recognition. The results across different criteria demonstrate the superiority of the proposed method.
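The parameter-reduction idea mentioned above can be made concrete with a minimal sketch. This is not the authors' DRRNet algorithm; it only illustrates, under assumed illustrative dimensions, why factoring a fully connected layer's weight matrix as a product of two low-rank factors shrinks the parameter count from d_out*d_in to r*(d_out + d_in):

```python
import numpy as np

def low_rank_linear(x, U, V, b):
    """Fully connected layer with weight factored as W = U @ V.

    U has shape (d_out, r) and V has shape (r, d_in), so the layer
    stores r*(d_out + d_in) weights instead of d_out*d_in.
    """
    # Apply V first, then U: x @ V.T @ U.T == x @ (U @ V).T
    return x @ V.T @ U.T + b

# Illustrative sizes (not taken from the paper)
rng = np.random.default_rng(0)
d_in, d_out, r = 512, 256, 16
U = rng.standard_normal((d_out, r)) * 0.01
V = rng.standard_normal((r, d_in)) * 0.01
b = np.zeros(d_out)

x = rng.standard_normal((4, d_in))   # batch of 4 input vectors
y = low_rank_linear(x, U, V, b)      # shape (4, 256)

full_params = d_out * d_in           # 131072 for the dense layer
low_params = r * (d_out + d_in)      # 12288 at rank 16
```

With rank 16 the factored layer here holds under 10% of the dense layer's parameters; the paper additionally enforces such constraints during training via its low-rank and sparsity-projection optimization algorithm.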