Unsupervised modeling of partially observable environments

Vincent Graziano, Jan Koutník, Jürgen Schmidhuber

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, temporal network for transitions (TNT), enjoys the freedoms of unsupervised learning, works on-line, in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both the exploitation and exploration of the environment. Experiments demonstrate that TNT learns nice representations of classical reinforcement learning mazes of varying size (up to 20 × 20) under conditions of high noise and stochastic actions. © 2011 Springer-Verlag.
Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 503-515
Number of pages: 13
DOIs
State: Published - Sep 9 2011
Externally published: Yes

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
