TY - GEN
T1 - Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
AU - Graves, Alex
AU - Fernández, Santiago
AU - Gomez, Faustino
AU - Schmidhuber, Jürgen
N1 - Generated from Scopus record by KAUST IRTS on 2022-09-14
PY - 2006/12/1
Y1 - 2006/12/1
AB - Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.
UR - http://portal.acm.org/citation.cfm?doid=1143844.1143891
UR - http://www.scopus.com/inward/record.url?scp=34250704813&partnerID=8YFLogxK
U2 - 10.1145/1143844.1143891
DO - 10.1145/1143844.1143891
M3 - Conference contribution
SN - 1595933832
SP - 369
EP - 376
BT - ACM International Conference Proceeding Series
ER -