TY - JOUR
T1 - Training recurrent networks by evolino
AU - Schmidhuber, Jürgen
AU - Wierstra, Daan
AU - Gagliolo, Matteo
AU - Gomez, Faustino
N1 - Generated from Scopus record by KAUST IRTS on 2022-09-14
PY - 2007/3/1
Y1 - 2007/3/1
N2 - In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudo-inverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-based LSTM. © 2007 Massachusetts Institute of Technology.
UR - https://direct.mit.edu/neco/article/19/3/757-779/7156
UR - http://www.scopus.com/inward/record.url?scp=33847649288&partnerID=8YFLogxK
U2 - 10.1162/neco.2007.19.3.757
DO - 10.1162/neco.2007.19.3.757
M3 - Article
SN - 0899-7667
VL - 19
SP - 757
EP - 779
JO - Neural Computation
JF - Neural Computation
IS - 3
ER -