TY - GEN
T1 - Learning context sensitive languages with LSTM trained with Kalman filters
AU - Gers, Felix A.
AU - Pérez-Ortiz, Juan Antonio
AU - Eck, Douglas
AU - Schmidhuber, Jürgen
PY - 2002
AB - Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ≤ 10) of the context-sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and more. Even when we consider the relatively high update complexity per timestep, in many cases the hybrid offers faster learning than LSTM by itself.
UR - http://link.springer.com/10.1007/3-540-46084-5_107
UR - http://www.scopus.com/inward/record.url?scp=84902132412&partnerID=8YFLogxK
DO - 10.1007/3-540-46084-5_107
M3 - Conference contribution
SN - 9783540440741
SP - 655
EP - 660
BT - Artificial Neural Networks - ICANN 2002
T3 - Lecture Notes in Computer Science
PB - Springer-Verlag
ER -