School of Computer Science and Technology, Hangzhou Dianzi University, China.
Whiting School of Engineering, Johns Hopkins University, USA.
Neural Netw. 2021 Dec;144:297-306. doi: 10.1016/j.neunet.2021.08.032. Epub 2021 Sep 6.
The recurrent network architecture is widely used in sequence modeling, but its serial dependencies hinder parallel computation and make it inefficient. The serial adder faced the same problem in the early days of digital electronics. In this paper, we discuss the similarities between the recurrent neural network (RNN) and the serial adder. Inspired by the carry-lookahead adder, we introduce a carry-lookahead module into the RNN, which makes it possible for the RNN to run in parallel. We then design a method for parallel RNN computation and propose the Carry-lookahead RNN (CL-RNN). CL-RNN offers the advantages of parallelism and a flexible receptive field. Through a comprehensive set of tests, we verify that CL-RNN outperforms existing typical RNNs on sequence modeling tasks specially designed for RNNs. Code and models are available at: https://github.com/WinnieJiangHW/Carry-lookahead_RNN.
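The carry-lookahead analogy the abstract invokes can be made concrete with a minimal sketch (this is an illustration of the adder-side idea only, not the paper's CL-RNN module; all function names are our own). The carry chain of a serial adder, c[i+1] = g[i] | (p[i] & c[i]), is an associative recurrence over (generate, propagate) pairs, so every carry can be obtained by a prefix combination instead of bit-by-bit propagation; in hardware the combines form a log-depth tree, which is the source of the parallel speed-up that motivates the RNN design.

```python
def serial_carries(g, p, c0=0):
    # Bit-serial carry propagation: O(n) strictly sequential steps,
    # the same dependency pattern as an RNN's hidden-state update.
    carries = [c0]
    for gi, pi in zip(g, p):
        carries.append(gi | (pi & carries[-1]))
    return carries

def combine(a, b):
    # Associative composition of (generate, propagate) pairs:
    # the effect of traversing segment a, then segment b.
    ga, pa = a
    gb, pb = b
    return (gb | (pb & ga), pb & pa)

def lookahead_carries(g, p, c0=0):
    # Carry-lookahead view: fold (g, p) pairs with the associative
    # `combine`; because it is associative, hardware can evaluate the
    # prefixes in a log-depth tree rather than this serial loop.
    carries = [c0]
    acc = (0, 1)  # identity element: generates nothing, propagates all
    for gi, pi in zip(g, p):
        acc = combine(acc, (gi, pi))
        ga, pa = acc
        carries.append(ga | (pa & c0))
    return carries
```

Both functions produce identical carry sequences; the point is that the lookahead form exposes the parallelism that the serial recurrence hides, which is the property CL-RNN borrows for sequence modeling.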