Annual Conference on Neural Information Processing Systems

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search



Abstract

Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems, because it avoids expensive applications of Bayes' rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing that it works in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration.
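The core idea the abstract describes (drawing a complete model from the current posterior once per simulation at the root, then running an ordinary tree search through that sampled model, so no Bayes'-rule update is ever performed inside the tree) can be sketched in Python. This is a heavily simplified illustration, not the paper's algorithm: the two-armed Bernoulli bandit with Beta posteriors, the function names, and all parameters are assumptions chosen to keep the example self-contained.

```python
import math
import random
from collections import defaultdict

def bamcp_plan(alpha, beta, horizon=3, n_sims=2000, c=1.0, rng=None):
    """Approximate Bayes-adaptive planning for a 2-armed Bernoulli bandit
    via root sampling + UCT (a sketch of the lazy-sampling idea).

    alpha[i], beta[i]: Beta posterior parameters for arm i's success rate.
    Returns the greedy action at the root after n_sims simulations."""
    rng = rng or random.Random(0)
    N = defaultdict(int)     # visit count per (history, action)
    Nh = defaultdict(int)    # visit count per history
    Q = defaultdict(float)   # running mean return per (history, action)

    for _ in range(n_sims):
        # Root sampling: draw one complete MDP (here, arm means) from the
        # posterior. The whole simulation then uses this fixed sample, so
        # no posterior update is needed inside the search tree.
        p = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
        _simulate((), p, horizon, N, Nh, Q, c, rng)

    return max(range(len(alpha)), key=lambda a: Q[((), a)])

def _simulate(h, p, depth, N, Nh, Q, c, rng):
    """One UCT rollout through the history-indexed tree."""
    if depth == 0:
        return 0.0

    def ucb(a):
        if N[(h, a)] == 0:
            return float('inf')  # force one try of each untried action
        return Q[(h, a)] + c * math.sqrt(math.log(Nh[h] + 1) / N[(h, a)])

    a = max(range(len(p)), key=ucb)
    r = 1.0 if rng.random() < p[a] else 0.0
    ret = r + _simulate(h + ((a, r),), p, depth - 1, N, Nh, Q, c, rng)

    # Back up the return along the visited (history, action) edge.
    Nh[h] += 1
    N[(h, a)] += 1
    Q[(h, a)] += (ret - Q[(h, a)]) / N[(h, a)]
    return ret
```

With a posterior that strongly favours arm 1, e.g. `bamcp_plan([1, 5], [5, 1])` (arm 0 ~ Beta(1,5), arm 1 ~ Beta(5,1)), the planner's root action concentrates on arm 1. The key design point mirrored from the abstract is that the posterior is touched only once per simulation, at the root, rather than at every node of the tree.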


