Framewise phoneme classification with bidirectional LSTM networks

Alex Graves, Jürgen Schmidhuber

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

480 Scopus citations

Abstract

In this paper, we apply bidirectional training to a Long Short Term Memory (LSTM) network for the first time. We also present a modified, full gradient version of the LSTM learning algorithm. We discuss the significance of framewise phoneme classification to continuous speech recognition, and the validity of using bidirectional networks for online causal tasks. On the TIMIT speech database, we measure the framewise phoneme classification scores of bidirectional and unidirectional variants of both LSTM and conventional Recurrent Neural Networks (RNNs). We find that bidirectional LSTM outperforms both RNNs and unidirectional LSTM. © 2005 IEEE.
Original language: English (US)
Title of host publication: Proceedings of the International Joint Conference on Neural Networks
Pages: 2047-2052
Number of pages: 6
State: Published - Dec 1 2005
Externally published: Yes
