Reinforcement Learning (RL) is a popular approach for solving an increasing number of problems. However, the standard RL approach has several deficiencies. This paper discusses multiple approaches that address those deficiencies by incorporating Supervised Learning, and proposes a new approach, Reinforcement Learning with Adaptive Supervisor. In this model, actions chosen by the RL method are rated by a supervisor and may be replaced with safer ones. The supervisor observes the results of each action and, on that basis, learns how safe actions are in various states. This helps to overcome one deficiency of Reinforcement Learning: the risk of executing a wrong action. The new approach is designed for domains where failures are very expensive. The architecture was evaluated on a car intersection model, where the proposed method eliminated around 50% of failures.
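The interaction between the RL method and the supervisor can be illustrated with a minimal sketch. It is not the architecture evaluated in the paper: the tabular failure counts, the `risk_threshold` parameter, the class and method names, and the state and action labels are all assumptions made for illustration only.

```python
from collections import defaultdict


class AdaptiveSupervisor:
    """Hypothetical tabular supervisor (an assumption, not the paper's model).

    It counts how often each (state, action) pair ended in failure and
    uses the observed failure rate as a risk estimate.
    """

    def __init__(self, actions, risk_threshold=0.5):
        self.actions = list(actions)
        self.risk_threshold = risk_threshold  # assumed tuning parameter
        self.trials = defaultdict(int)
        self.failures = defaultdict(int)

    def risk(self, state, action):
        """Estimated failure probability; 0.0 for pairs never tried."""
        n = self.trials[(state, action)]
        return self.failures[(state, action)] / n if n else 0.0

    def review(self, state, proposed):
        """Keep the proposed action, or replace it with the lowest-risk one."""
        if self.risk(state, proposed) <= self.risk_threshold:
            return proposed
        return min(self.actions, key=lambda a: self.risk(state, a))

    def observe(self, state, action, failed):
        """Update the safety knowledge from the outcome of an executed action."""
        self.trials[(state, action)] += 1
        if failed:
            self.failures[(state, action)] += 1


# Illustrative use with made-up intersection states and actions.
supervisor = AdaptiveSupervisor(actions=["go", "wait"])
supervisor.observe("car_approaching", "go", failed=True)
chosen = supervisor.review("car_approaching", "go")  # the RL method proposed "go"
```

In this sketch the supervisor only overrides the RL method once it has seen the proposed action fail in that state, so pairs it has never observed pass through unchanged.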