Abstract
In this paper we demonstrate that Long Short-Term Memory (LSTM), a differentiable recurrent neural network (RNN), is capable of robustly categorizing time-warped speech data. We measure its performance on a spoken digit identification task in which the data were spike-encoded such that classifying the utterances became a difficult challenge in non-linear time-warping. We find that LSTM gives results greatly superior to those of a spiking neural network (SNN) from the literature, and conclude that the architecture has a place in domains that require learning large time-warped datasets, such as automatic speech recognition.
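To make the classification setup concrete, the following is a minimal sketch of how an LSTM can score a spike-encoded utterance: a forward pass through an LSTM cell over a binary spike train, followed by a linear readout over the ten digit classes. This is an illustrative NumPy sketch, not the authors' implementation; all names, dimensions, and the random toy data are assumptions.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates are computed from input x and previous hidden h."""
    z = W @ x + U @ h + b               # stacked pre-activations for the 4 gates
    H = h.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))        # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))     # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))   # output gate
    g = np.tanh(z[3*H:])                # candidate cell update
    c = f * c + i * g                   # cell state: gated memory
    h = o * np.tanh(c)                  # hidden state
    return h, c

def classify(spikes, params):
    """Run the LSTM over a spike sequence; return the highest-scoring class."""
    W, U, b, V = params
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x in spikes:                    # one time step per spike frame
        h, c = lstm_step(x, h, c, W, U, b)
    logits = V @ h                      # linear readout from the final state
    return int(np.argmax(logits))

# Toy setup (hypothetical sizes): 8 input channels, 16 hidden units, 10 digits.
rng = np.random.default_rng(0)
D, H, C = 8, 16, 10
params = (rng.normal(size=(4*H, D)) * 0.1,
          rng.normal(size=(4*H, H)) * 0.1,
          np.zeros(4*H),
          rng.normal(size=(C, H)) * 0.1)
spikes = (rng.random((30, D)) < 0.2).astype(float)   # toy binary spike train
digit = classify(spikes, params)
```

Because the cell state is updated multiplicatively through the forget and input gates, the network can retain evidence across stretches of the utterance regardless of how it is warped in time, which is what the paper exploits.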
| Original language | English (US) |
| --- | --- |
| Title of host publication | Proceedings of the IASTED International Conference on Neural Networks and Computational Intelligence |
| Pages | 164-168 |
| Number of pages | 5 |
| State | Published - Dec 1 2004 |
| Externally published | Yes |