European Conference on Artificial Intelligence

Uncertainty Propagation for Efficient Exploration in Reinforcement Learning



Abstract

Reinforcement learning aims to derive an optimal policy for an environment that is often initially unknown. When the environment is unknown, exploration is used to acquire knowledge about it. In that context the well-known exploration-exploitation dilemma arises: when should one stop exploring and instead exploit the knowledge already gathered? In this paper we propose an uncertainty-based exploration method. We use uncertainty propagation to obtain the Q-function's uncertainty, and then use this uncertainty in combination with the Q-values to guide exploration toward promising states that so far have been insufficiently explored. A parameter controls the uncertainty's weight during action selection. We evaluate one variant of the algorithm using full covariance matrices and two variants using an approximation, and demonstrate their functionality on two benchmark problems.
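The uncertainty-weighted action selection described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the weight parameter `xi`, the function names, and the diagonal (variance-only) propagation rule are illustrative stand-ins for the paper's full-covariance formulation, not its exact method.

```python
import numpy as np

def uncertainty_weighted_action(q_row, sigma_row, xi):
    """Greedy action w.r.t. Q plus an uncertainty bonus xi * sigma.

    A larger xi (an assumed parameter name) biases selection toward
    actions whose Q-value estimate is still uncertain.
    """
    return int(np.argmax(q_row + xi * sigma_row))

def propagate_update(q, sigma, s, a, r, s_next,
                     alpha=0.1, gamma=0.95, xi=1.0):
    """One Q-learning step with diagonal uncertainty propagation.

    q, sigma: (n_states, n_actions) arrays of Q-values and their
    standard deviations, updated in place. The variance is pushed
    through the linear update rule Var[(1-alpha)X + alpha*gamma*Y],
    a simplified stand-in for the full covariance variant.
    """
    a_next = uncertainty_weighted_action(q[s_next], sigma[s_next], xi)
    target = r + gamma * q[s_next, a_next]
    q[s, a] = (1 - alpha) * q[s, a] + alpha * target
    var = ((1 - alpha) ** 2 * sigma[s, a] ** 2
           + (alpha * gamma) ** 2 * sigma[s_next, a_next] ** 2)
    sigma[s, a] = np.sqrt(var)
    return q, sigma
```

With `xi = 0` the rule reduces to ordinary greedy selection on the Q-values; increasing `xi` shifts the trade-off toward exploring insufficiently visited state-action pairs, mirroring the parameterized weighting the abstract describes.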


