
Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress



Abstract

Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without consideration of the empirical prediction error. For example, PAC-MDP approaches such as R-MAX base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches which drive exploration solely based on empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then provide experimental studies demonstrating the robustness of these exploration measures in cases of non-stationary environments or where original approaches are misled by wrong domain assumptions.
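To make the abstract's idea concrete, the sketch below shows one way an empirical learning-progress signal could be computed for a tabular MDP and used as an exploration bonus in place of a purely count-based certainty measure such as the one in R-MAX. This is a minimal illustration, not the paper's implementation; the class name `LearningProgressExplorer`, the sliding-window size `window`, and the bonus scaling are hypothetical choices.

```python
from collections import defaultdict, deque

import numpy as np


class LearningProgressExplorer:
    """Illustrative sketch (not the authors' code): track the empirical
    prediction error of a learned transition model per (state, action)
    and use its recent decrease ("learning progress") as an exploration
    bonus, instead of relying only on visit counts."""

    def __init__(self, n_states, n_actions, window=20):
        self.n_states = n_states
        self.n_actions = n_actions
        # Transition counts for the maximum-likelihood model P(s' | s, a).
        self.counts = np.zeros((n_states, n_actions, n_states))
        # Recent per-step prediction errors for each (s, a).
        self.errors = defaultdict(lambda: deque(maxlen=window))

    def predict(self, s, a):
        """Current estimate of the next-state distribution for (s, a)."""
        c = self.counts[s, a]
        total = c.sum()
        if total == 0:
            return np.full(self.n_states, 1.0 / self.n_states)  # uniform prior
        return c / total

    def update(self, s, a, s_next):
        """Record the prediction error of the current model on the new
        sample (negative log-likelihood of the observed next state),
        then update the model."""
        p = self.predict(s, a)
        err = -np.log(max(p[s_next], 1e-8))
        self.errors[(s, a)].append(err)
        self.counts[s, a, s_next] += 1

    def learning_progress(self, s, a):
        """Decrease in mean prediction error between the older and newer
        halves of the window; large values mean the model for (s, a)
        is still improving."""
        e = list(self.errors[(s, a)])
        if len(e) < 4:
            # Barely visited pairs get a maximal bonus (R-MAX-like behavior).
            return float("inf")
        half = len(e) // 2
        return max(float(np.mean(e[:half]) - np.mean(e[half:])), 0.0)

    def exploration_bonus(self, s, a, scale=1.0):
        """Bonus to add to the empirical reward used by the planner."""
        return scale * self.learning_progress(s, a)
```

In a model-based planner, the value returned by `exploration_bonus(s, a)` would be added to the empirical reward before value iteration, so state-action pairs whose model is still improving keep attracting visits, while pairs whose prediction error has plateaued stop being explored, even if their visit count is low.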


