Abstract
This paper describes a method for hierarchical reinforcement learning in which high-level policies automatically discover subgoals, and low-level policies learn to specialize for different subgoals. Subgoals are represented as desired abstract observations that cluster the raw input data. High-level value functions cover the state space at a coarse level; low-level value functions cover only parts of the state space, at a fine-grained level. An experiment shows that this method outperforms several flat reinforcement learning methods. A second experiment shows how problems of partial observability caused by the observation abstraction can be overcome using high-level policies with memory.
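The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea under tabular Q-learning assumptions: a high-level value function defined over abstract observations selects subgoals (desired abstract observations), and a separate low-level value function per subgoal learns primitive actions. The toy grid world, the fixed block partition standing in for a learned clustering, and the reward shaping are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch (not the paper's code): tabular hierarchical Q-learning where
# the high level chooses subgoals expressed as abstract observations and a
# separate low-level value function specializes for each subgoal.
import numpy as np

rng = np.random.default_rng(0)

SIZE, GOAL = 8, (7, 7)                         # 8x8 grid, external reward 1 at GOAL
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(state, a):
    dr, dc = ACTIONS[a]
    nxt = (min(max(state[0] + dr, 0), SIZE - 1),
           min(max(state[1] + dc, 0), SIZE - 1))
    return nxt, float(nxt == GOAL), nxt == GOAL

# Observation abstraction: a fixed partition into four coarse blocks stands in
# for a learned clustering of raw observations.
def abstract(state):
    return (state[0] // 4) * 2 + (state[1] // 4)

def flat(state):
    return state[0] * SIZE + state[1]

def greedy(q):
    # Greedy action with random tie-breaking.
    return int(rng.choice(np.flatnonzero(q == q.max())))

N_ABS = 4
Q_hi = np.zeros((N_ABS, N_ABS))                      # abstract observation -> subgoal
Q_lo = np.zeros((N_ABS, SIZE * SIZE, len(ACTIONS)))  # one fine-grained table per subgoal
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(500):
    s, done, steps = (0, 0), False, 0
    while not done and steps < 400:
        z = abstract(s)
        # High level picks a subgoal: the abstract observation it wants to reach.
        g = int(rng.integers(N_ABS)) if rng.random() < eps else greedy(Q_hi[z])
        ext_return, s0 = 0.0, s
        for t in range(20):                  # low level acts until subgoal or timeout
            a = int(rng.integers(len(ACTIONS))) if rng.random() < eps else greedy(Q_lo[g, flat(s)])
            s2, r, done = step(s, a)
            # Low-level reward: external reward plus a bonus for reaching the subgoal.
            r_lo = r + (1.0 if abstract(s2) == g else 0.0)
            target = r_lo + (0.0 if done else gamma * Q_lo[g, flat(s2)].max())
            Q_lo[g, flat(s), a] += alpha * (target - Q_lo[g, flat(s), a])
            ext_return += (gamma ** t) * r
            s, steps = s2, steps + 1
            if done or abstract(s) == g:
                break
        # High level learns from the discounted external reward of the whole subgoal attempt.
        hi_target = ext_return + (0.0 if done else (gamma ** (t + 1)) * Q_hi[abstract(s)].max())
        Q_hi[abstract(s0), g] += alpha * (hi_target - Q_hi[abstract(s0), g])

print("High-level Q over abstract observations:\n", Q_hi.round(2))
```

In this sketch the high-level table is coarse (four abstract observations), while each low-level table is fine-grained but only ever visited in the region relevant to its subgoal, mirroring the coarse/fine split described above.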
| Original language | English (US) |
| --- | --- |
| Title of host publication | Proceedings of the IASTED International Conference on Neural Networks and Computational Intelligence |
| Pages | 125-130 |
| Number of pages | 6 |
| State | Published - Dec 1 2004 |
| Externally published | Yes |