Canadian Conference on Artificial Intelligence

Advice-Based Exploration in Model-Based Reinforcement Learning



Abstract

Convergence to an optimal policy using model-based reinforcement learning can require significant exploration of the environment. In some settings such exploration is costly or even impossible, for example when no simulator is available or when the state space is prohibitively large. In this paper we examine the use of advice to guide the search for an optimal policy. To this end we propose a rich language for providing advice to a reinforcement learning agent. Unlike constraints, which can eliminate optimal policies, advice offers guidance for exploration while preserving the guarantee of convergence to an optimal policy. Experimental results on deterministic grid worlds demonstrate the potential for good advice to reduce the amount of exploration required to learn a satisficing or optimal policy, while maintaining robustness in the face of incomplete or misleading advice.
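
A minimal sketch of the idea, not the authors' algorithm: the abstract does not reproduce the advice language or the underlying learner, so the Python below assumes an R-max-style model-based agent on a deterministic grid world, with a hypothetical advice() function standing in for the paper's richer advice language ("prefer moving toward the goal corner") that is used only to break ties among equally promising optimistic actions. All names here (step, advice, plan, choose) and the grid-world parameters are illustrative assumptions.

GRID = 5                                        # 5x5 grid, goal at (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # right, left, down, up
RMAX, GAMMA, SWEEPS = 1.0, 0.95, 60

def step(state, action):
    # True environment dynamics: deterministic moves, walls clip.
    x, y = state
    dx, dy = action
    nx = min(max(x + dx, 0), GRID - 1)
    ny = min(max(y + dy, 0), GRID - 1)
    reward = 1.0 if (nx, ny) == (GRID - 1, GRID - 1) else 0.0
    return (nx, ny), reward

def advice(state, action):
    # Hypothetical advice: prefer right/down, i.e. toward the goal corner.
    return 1 if action in [(0, 1), (1, 0)] else 0

def plan(model):
    # Value iteration over the learned model; unknown state-action pairs
    # are scored optimistically (R-max style), so unexplored regions stay
    # attractive and the optimality guarantee survives bad advice.
    states = [(x, y) for x in range(GRID) for y in range(GRID)]
    V = {s: 0.0 for s in states}
    optimistic = RMAX / (1.0 - GAMMA)
    for _ in range(SWEEPS):
        for s in states:
            V[s] = max(
                model[(s, a)][1] + GAMMA * V[model[(s, a)][0]]
                if (s, a) in model else optimistic
                for a in ACTIONS
            )
    return V

def choose(state, model, V):
    # Greedy with respect to optimistic values; advice only breaks ties.
    def q(a):
        if (state, a) in model:
            s2, r = model[(state, a)]
            return r + GAMMA * V[s2]
        return RMAX / (1.0 - GAMMA)
    best = max(q(a) for a in ACTIONS)
    tied = [a for a in ACTIONS if abs(q(a) - best) < 1e-9]
    return max(tied, key=lambda a: advice(state, a))

model, state = {}, (0, 0)
V = plan(model)
for t in range(500):
    a = choose(state, model, V)
    s2, r = step(state, a)
    if (state, a) not in model:
        model[(state, a)] = (s2, r)    # deterministic: one sample suffices
        V = plan(model)                # replan only when the model changes
    state = (0, 0) if r > 0 else s2    # restart the episode at the goal
print("state-action pairs explored:", len(model))

Because advice in this sketch can only reorder actions whose optimistic values are tied, misleading advice changes the order of exploration but never removes a state-action pair from consideration, which is one simple way to realize the robustness property the abstract claims.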
