Abstract
This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb the parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages over action perturbation. We review two recent parameter-exploring algorithms: Natural Evolution Strategies and Policy Gradients with Parameter-Based Exploration. Both outperform state-of-the-art algorithms on several complex, high-dimensional tasks commonly found in robot control. Furthermore, we describe how a novel exploration method, State-Dependent Exploration, can modify existing algorithms to mimic exploration in parameter space.
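To make the core idea concrete, here is a minimal Python sketch of parameter-based exploration, loosely following the symmetric-sampling form of Policy Gradients with Parameter-Based Exploration (PGPE): the policy parameters are perturbed once per episode and the policy then acts deterministically, instead of noise being injected into every action. The toy task, variable names, and step sizes are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: the episodic return is highest when the
# policy parameters match an unknown target vector. A real rollout
# would run a full episode with a deterministic policy.
target = np.array([0.5, -1.2, 2.0])

def rollout(theta):
    return -float(np.sum((theta - target) ** 2))

mu = np.zeros(3)          # mean of the parameter distribution
sigma = np.ones(3)        # per-parameter exploration magnitudes
alpha_mu, alpha_sigma = 0.05, 0.02
baseline = 0.0            # moving-average return baseline

for episode in range(2000):
    # Parameter-space exploration: one perturbation per episode.
    eps = rng.normal(0.0, sigma)
    r_plus, r_minus = rollout(mu + eps), rollout(mu - eps)
    r_avg = 0.5 * (r_plus + r_minus)
    # Likelihood-ratio gradient estimates from the symmetric pair.
    mu += alpha_mu * 0.5 * (r_plus - r_minus) * eps / sigma**2
    sigma += alpha_sigma * (r_avg - baseline) * (eps**2 - sigma**2) / sigma
    sigma = np.maximum(sigma, 1e-3)   # keep exploration widths positive
    baseline = 0.9 * baseline + 0.1 * r_avg

print("learned parameters:", mu)      # should approach `target`
```

Because each episode is driven by a single coherent parameter sample rather than independent per-step action noise, the return signal cleanly credits one point in parameter space, which is the advantage over action perturbation that the abstract refers to.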
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 14-24 |
| Number of pages | 11 |
| Journal | Paladyn |
| Volume | 1 |
| Issue number | 1 |
| DOIs | |
| State | Published - Mar 1 2010 |
Keywords
- exploration
- optimization
- policy gradients
- reinforcement learning
ASJC Scopus subject areas
- Human-Computer Interaction
- Developmental Neuroscience
- Cognitive Neuroscience
- Artificial Intelligence
- Behavioral Neuroscience