
Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress



Abstract

Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without consideration of the empirical prediction error. For example, PAC-MDP approaches such as R-MAX base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches which drive exploration solely based on empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then provide experimental studies demonstrating the robustness of these exploration measures in cases of non-stationary environments or where original approaches are misled by wrong domain assumptions.
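To make the abstract's idea concrete, the sketch below shows one way an empirical learning-progress signal could be computed for a tabular MDP and used as an exploration bonus in place of a purely count-based certainty measure such as the one in R-MAX. This is a minimal illustration, not the paper's implementation; the class name `LearningProgressExplorer`, the sliding-window size `window`, and the bonus scaling are hypothetical choices.

```python
from collections import defaultdict, deque

import numpy as np


class LearningProgressExplorer:
    """Illustrative sketch (not the authors' code): track the empirical
    prediction error of a learned transition model per (state, action)
    and use its recent decrease ("learning progress") as an exploration
    bonus, instead of relying only on visit counts."""

    def __init__(self, n_states, n_actions, window=20):
        self.n_states = n_states
        self.n_actions = n_actions
        # Transition counts for the maximum-likelihood model P(s' | s, a).
        self.counts = np.zeros((n_states, n_actions, n_states))
        # Recent per-step prediction errors for each (s, a).
        self.errors = defaultdict(lambda: deque(maxlen=window))

    def predict(self, s, a):
        """Current estimate of the next-state distribution for (s, a)."""
        c = self.counts[s, a]
        total = c.sum()
        if total == 0:
            return np.full(self.n_states, 1.0 / self.n_states)  # uniform prior
        return c / total

    def update(self, s, a, s_next):
        """Record the prediction error of the current model on the new
        sample (negative log-likelihood of the observed next state),
        then update the model."""
        p = self.predict(s, a)
        err = -np.log(max(p[s_next], 1e-8))
        self.errors[(s, a)].append(err)
        self.counts[s, a, s_next] += 1

    def learning_progress(self, s, a):
        """Decrease in mean prediction error between the older and newer
        halves of the window; large values mean the model for (s, a)
        is still improving."""
        e = list(self.errors[(s, a)])
        if len(e) < 4:
            # Barely visited pairs get a maximal bonus (R-MAX-like behavior).
            return float("inf")
        half = len(e) // 2
        return max(float(np.mean(e[:half]) - np.mean(e[half:])), 0.0)

    def exploration_bonus(self, s, a, scale=1.0):
        """Bonus to add to the empirical reward used by the planner."""
        return scale * self.learning_progress(s, a)
```

In a model-based planner, the value returned by `exploration_bonus(s, a)` would be added to the empirical reward before value iteration, so state-action pairs whose model is still improving keep attracting visits, while pairs whose prediction error has plateaued stop being explored, even if their visit count is low.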


