Canadian Conference on Artificial Intelligence

Advice-Based Exploration in Model-Based Reinforcement Learning



Abstract

Convergence to an optimal policy using model-based reinforcement learning can require significant exploration of the environment. In some settings such exploration is costly or even impossible, for example when no simulator is available or when the state space is prohibitively large. In this paper we examine the use of advice to guide the search for an optimal policy. To this end we propose a rich language for providing advice to a reinforcement learning agent. Unlike constraints, which can eliminate optimal policies, advice offers guidance for exploration while preserving the guarantee of convergence to an optimal policy. Experimental results on deterministic grid worlds demonstrate the potential for good advice to reduce the amount of exploration required to learn a satisficing or optimal policy, while maintaining robustness in the face of incomplete or misleading advice.
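
A minimal sketch of the idea, not the authors' algorithm: the abstract does not reproduce the advice language or the underlying learner, so the Python below assumes an R-max-style model-based agent on a deterministic grid world, with a hypothetical advice() function standing in for the paper's richer advice language ("prefer moving toward the goal corner") that is used only to break ties among equally promising optimistic actions. All names here (step, advice, plan, choose) and the grid-world parameters are illustrative assumptions.

GRID = 5                                        # 5x5 grid, goal at (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # right, left, down, up
RMAX, GAMMA, SWEEPS = 1.0, 0.95, 60

def step(state, action):
    # True environment dynamics: deterministic moves, walls clip.
    x, y = state
    dx, dy = action
    nx = min(max(x + dx, 0), GRID - 1)
    ny = min(max(y + dy, 0), GRID - 1)
    reward = 1.0 if (nx, ny) == (GRID - 1, GRID - 1) else 0.0
    return (nx, ny), reward

def advice(state, action):
    # Hypothetical advice: prefer right/down, i.e. toward the goal corner.
    return 1 if action in [(0, 1), (1, 0)] else 0

def plan(model):
    # Value iteration over the learned model; unknown state-action pairs
    # are scored optimistically (R-max style), so unexplored regions stay
    # attractive and the optimality guarantee survives bad advice.
    states = [(x, y) for x in range(GRID) for y in range(GRID)]
    V = {s: 0.0 for s in states}
    optimistic = RMAX / (1.0 - GAMMA)
    for _ in range(SWEEPS):
        for s in states:
            V[s] = max(
                model[(s, a)][1] + GAMMA * V[model[(s, a)][0]]
                if (s, a) in model else optimistic
                for a in ACTIONS
            )
    return V

def choose(state, model, V):
    # Greedy with respect to optimistic values; advice only breaks ties.
    def q(a):
        if (state, a) in model:
            s2, r = model[(state, a)]
            return r + GAMMA * V[s2]
        return RMAX / (1.0 - GAMMA)
    best = max(q(a) for a in ACTIONS)
    tied = [a for a in ACTIONS if abs(q(a) - best) < 1e-9]
    return max(tied, key=lambda a: advice(state, a))

model, state = {}, (0, 0)
V = plan(model)
for t in range(500):
    a = choose(state, model, V)
    s2, r = step(state, a)
    if (state, a) not in model:
        model[(state, a)] = (s2, r)    # deterministic: one sample suffices
        V = plan(model)                # replan only when the model changes
    state = (0, 0) if r > 0 else s2    # restart the episode at the goal
print("state-action pairs explored:", len(model))

Because advice in this sketch can only reorder actions whose optimistic values are tied, misleading advice changes the order of exploration but never removes a state-action pair from consideration, which is one simple way to realize the robustness property the abstract claims.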
