Reinforcement learning is an efficient method for solving Markov Decision Processes in which an agent improves its performance using a scalar reward value, with a high capability for reactive and adaptive behaviors. Q-learning is a representative reinforcement learning method that is guaranteed to obtain an optimal policy, but it needs numerous trials to achieve it. The k-Certainty Exploration Learning System realizes active exploration [...] separated into two phases, and estimated values are not derived during the process of identifying the environment.
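To make the remark about Q-learning concrete, the following is a minimal sketch of the standard tabular Q-learning update. It is not the k-Certainty Exploration method. The environment (a 1-D chain with a reward at the right end), the function name `q_learning`, and all parameter values are assumptions chosen purely for illustration.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustrative assumption).

    Actions: 0 = move left, 1 = move right. Entering the last state
    yields reward 1 and ends the episode; all other rewards are 0.
    """
    rng = random.Random(seed)
    terminal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != terminal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == terminal else 0.0
            # The terminal state has no future value.
            future = 0.0 if s_next == terminal else max(q[s_next])
            # Standard update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            q[s][a] += alpha * (r + gamma * future - q[s][a])
            s = s_next
    return q

q_table = q_learning()
# Greedy action in each non-terminal state (1 means "move right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

Even on this tiny chain the reward information propagates back only one state per visit, which hints at why plain Q-learning can need many trials on larger problems.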