TY - GEN
T1 - Artificial curiosity based on discovering novel algorithmic predictability through coevolution
AU - Schmidhuber, Jürgen
N1 - Generated from Scopus record by KAUST IRTS on 2022-09-14
PY - 1999/1/1
Y1 - 1999/1/1
AB - One explores a spatio-temporal domain by predicting and learning from success/failure what's predictable and what's not. The author studies a «curious» embedded agent that differs from previous explorers in the sense that it can limit its predictions to fairly arbitrary, computable aspects of event sequences and thus can explicitly ignore almost arbitrary unpredictable, random aspects. It constructs initially random algorithms mapping event sequences to abstract internal representations (IRs). It also constructs algorithms predicting IRs from IRs computed earlier. It wants to learn novel algorithms creating IRs useful for correct IR predictions, without wasting time on those learned before. This is achieved by a co-evolutionary scheme involving two competing modules co-evolutionarily designing single algorithms to be executed. The modules can bet on the outcome of IR predictions computed by the algorithms they have agreed upon. If their opinions differ then the system checks who's right, punishes the loser (the surprised one), and rewards the winner. A reinforcement learning algorithm forces each module to maximise reward. This motivates both modules to lure the other into agreeing upon algorithms involving predictions that surprise it. Since each module essentially can put in its veto against algorithms it does not consider profitable, the system is motivated to focus on those computable aspects of the environment where both modules still have confident but different opinions. Once both share the same opinion on a particular issue, the winner loses a source of reward, which is an incentive to shift the focus of interest onto novel, yet unknown algorithms. © 1999 IEEE.
UR - http://ieeexplore.ieee.org/document/785467/
UR - http://www.scopus.com/inward/record.url?scp=84901391337&partnerID=8YFLogxK
U2 - 10.1109/CEC.1999.785467
DO - 10.1109/CEC.1999.785467
M3 - Conference contribution
SP - 1612
EP - 1618
BT - Proceedings of the 1999 Congress on Evolutionary Computation, CEC 1999
PB - IEEE Computer Society
ER -