Schmidhuber Jürgen, Wierstra Daan, Gagliolo Matteo, Gomez Faustino
IDSIA, 6928 Manno (Lugano), Switzerland.
Neural Comput. 2007 Mar;19(3):757-79. doi: 10.1162/neco.2007.19.3.757.
In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudo-inverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-based LSTM.
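The core idea of Evolino can be sketched in a few lines: evolution supplies the weights of the nonlinear recurrent part, while the hidden-to-output mapping is solved in closed form by pseudo-inverse-based linear regression. The sketch below is a minimal illustration under stated assumptions: it uses a plain tanh RNN as a stand-in for the paper's evolved LSTM, and all names (`run_rnn`, `fit_output_weights`, the toy data) are illustrative, not from the paper.

```python
# Minimal sketch of Evolino's linear-output step (assumption: a plain
# tanh RNN stands in for the evolved LSTM; only the closed-form output
# regression is as described in the abstract).
import numpy as np

rng = np.random.default_rng(0)

def run_rnn(W_in, W_rec, inputs):
    """Run a simple tanh RNN and collect its hidden states."""
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h.copy())
    return np.array(states)               # shape (T, n_hidden)

def fit_output_weights(H, Y):
    """Optimal linear map from hidden states to targets via the
    Moore-Penrose pseudo-inverse (least-squares solution)."""
    return np.linalg.pinv(H) @ Y          # shape (n_hidden, n_out)

# Toy fitness evaluation for one evolved individual: the recurrent
# weights would come from the evolutionary search; here they are random.
T, n_in, n_hid = 50, 1, 8
inputs = rng.normal(size=(T, n_in))
targets = np.sin(np.linspace(0.0, 3.0, T))[:, None]  # 1-D target sequence

W_in = rng.normal(scale=0.5, size=(n_hid, n_in))
W_rec = rng.normal(scale=0.5, size=(n_hid, n_hid))

H = run_rnn(W_in, W_rec, inputs)
W_out = fit_output_weights(H, targets)
mse = np.mean((H @ W_out - targets) ** 2)  # residual error = fitness signal
```

In the full method this residual error serves as the fitness used to select and mutate the recurrent weights; the paper's support-vector variant replaces the pseudo-inverse regression with quadratic programming that maximizes the margin of the output layer.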