Abstract
We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may profit significantly from world models (WMs) estimating state transition probabilities and rewards. In high-dimensional, continuous input spaces, however, learning accurate WMs is intractable. Here we show that incomplete WMs can help to quickly find good action selection policies. Our approach is based on a novel combination of CMACs and prioritized sweeping-like algorithms. Variants thereof outperform both Q(λ)-learning with CMACs and the evolutionary method Probabilistic Incremental Program Evolution (PIPE), which performed best in previous comparisons.
Original language | English (US) |
---|---|
Pages (from-to) | 77-88 |
Number of pages | 12 |
Journal | Autonomous Robots |
Volume | 7 |
Issue number | 1 |
State | Published - Jan 1 1999 |
Externally published | Yes |
ASJC Scopus subject areas
- Artificial Intelligence