TY - GEN
T1 - Multimodal parameter-exploring policy gradients
AU - Sehnke, Frank
AU - Graves, Alex
AU - Osendorfer, Christian
AU - Schmidhuber, Jürgen
N1 - Generated from Scopus record by KAUST IRTS on 2022-09-14
PY - 2010/12/1
Y1 - 2010/12/1
AB - Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estimates encountered in normal policy gradient methods. It has been shown to drastically speed up convergence for several large-scale reinforcement learning tasks. However, the independent normal distributions used by PGPE to search through parameter space are inadequate for some problems with multimodal reward surfaces. This paper extends the basic PGPE algorithm to use multimodal mixture distributions for each parameter, while remaining efficient. Experimental results on the Rastrigin function and the inverted pendulum benchmark demonstrate the advantages of this modification, with faster convergence to better optima. © 2010 IEEE.
UR - http://ieeexplore.ieee.org/document/5708821/
UR - http://www.scopus.com/inward/record.url?scp=79952387252&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2010.24
DO - 10.1109/ICMLA.2010.24
M3 - Conference contribution
SN - 9780769543000
SP - 113
EP - 118
BT - Proceedings - 9th International Conference on Machine Learning and Applications, ICMLA 2010
ER -