Sparse feedback in reinforcement learning problems makes feature extraction difficult. The authors present importance-based feature extraction, which guides the bottom-up self-organization of feature detectors with top-down information about the importance of the features; importance is defined in terms of the reinforcement values expected as a result of taking different actions when a feature is recognized. The authors illustrate these ideas on the pole-balancing task with a learning system that combines bottom-up detector tuning with a distributed version of Q-learning; adding importance-based feature extraction to the detector tuning resulted in faster learning.
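The abstract does not give the concrete update rules, so the following is only a minimal sketch of the general idea: define a feature's importance from the spread of expected reinforcement (Q-values) across actions when that feature is active, and scale bottom-up detector tuning by that importance. The detector layout (winner-take-all prototypes), the max-minus-min importance measure, the state dimensions for pole balancing, and all names and learning rates here are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # number of feature detectors (assumed size)
N_ACTIONS = 2    # e.g., push-left / push-right in pole balancing
STATE_DIM = 4    # cart position, cart velocity, pole angle, pole angular velocity

# Detector prototypes: bottom-up self-organization pulls them toward observed states.
prototypes = rng.normal(size=(N_FEATURES, STATE_DIM))

# Distributed Q-values: one expected-reinforcement estimate per (feature, action) pair.
Q = np.zeros((N_FEATURES, N_ACTIONS))


def active_feature(state):
    """Winner-take-all detector: the prototype nearest to the current state."""
    return int(np.argmin(np.linalg.norm(prototypes - state, axis=1)))


def importance(f):
    """Importance of feature f: spread of expected reinforcement across actions.
    A feature whose actions lead to very different returns matters for control."""
    return Q[f].max() - Q[f].min()


def tune_detector(f, state, base_lr=0.05):
    """Bottom-up tuning scaled by top-down importance: important detectors
    are pulled more strongly toward the states that activate them."""
    lr = base_lr * (1.0 + importance(f))
    prototypes[f] += lr * (state - prototypes[f])


def q_update(f, a, reward, next_state, alpha=0.1, gamma=0.95, done=False):
    """Tabular Q-learning update over the discrete feature space."""
    target = reward if done else reward + gamma * Q[active_feature(next_state)].max()
    Q[f, a] += alpha * (target - Q[f, a])
```

In a pole-balancing loop, each step would call `active_feature` on the observed state, act from `Q`, apply `q_update` with the resulting reward, and then call `tune_detector` so that detectors whose actions differ most in expected reinforcement are refined fastest.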