Canadian Conference on Artificial Intelligence

Options in Multi-task Reinforcement Learning - Transfer via Reflection


Abstract

Temporally extended actions such as options are known to lead to improvements in reinforcement learning (RL). At the same time, transfer learning across different RL tasks is an increasingly active area of research. Following Baxter's formalism for transfer, the corresponding RL question considers the benefit that an RL agent can achieve on new tasks, based on experience from previous tasks in a common 'learning environment'. We address this question in the specific context of goal-based multi-task RL, where the different tasks correspond to different goal states within a common state space, and introduce Landmark Options Via Reflection (LOVR), a flexible framework that uses options to transfer domain knowledge. As an explicit analog of principles in transfer learning, we provide theoretical and empirical results demonstrating that when a set of landmark states covers the state space suitably, a LOVR agent that learns optimal value functions for these landmarks in an initial phase and deploys the associated optimal policies as options in the main phase achieves a drastic reduction in cumulative regret compared to baseline approaches.
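The abstract outlines a two-phase scheme: learn optimal policies for a set of landmark states in an initial phase, then reuse them as temporally extended options when learning new goals in the same state space. The sketch below is a minimal toy illustration of that idea, not the paper's algorithm or experiments; the GridWorld environment, the tabular Q-learning routine, the landmark placement, and every hyperparameter here are assumptions made purely for this example.

```python
import random
from collections import defaultdict

class GridWorld:
    """Deterministic 2D grid task; an episode ends at `goal`.
    Reward is -1 per step, 0 on reaching the goal (shortest-path objective)."""
    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

    def __init__(self, size, goal):
        self.size, self.goal = size, goal

    def step(self, state, action):
        dx, dy = self.ACTIONS[action]
        nxt = (min(max(state[0] + dx, 0), self.size - 1),
               min(max(state[1] + dy, 0), self.size - 1))
        done = nxt == self.goal
        return nxt, (0.0 if done else -1.0), done

def q_learning(env, episodes=500, alpha=0.5, gamma=0.99, eps=0.1):
    """Tabular Q-learning toward env.goal; returns a greedy policy."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = (random.randrange(env.size), random.randrange(env.size))
        done, steps = s == env.goal, 0
        while not done and steps < 500:
            a = (random.randrange(4) if random.random() < eps
                 else max(range(4), key=lambda b: Q[(s, b)]))
            s2, r, done = env.step(s, a)
            target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in range(4)))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, steps = s2, steps + 1
    return lambda s: max(range(4), key=lambda b: Q[(s, b)])

SIZE = 9

# Initial phase: learn an optimal policy for each landmark state.
LANDMARKS = [(2, 2), (2, 6), (6, 2), (6, 6)]
landmark_policies = {g: q_learning(GridWorld(SIZE, g)) for g in LANDMARKS}

def run_option(env, state, landmark, max_steps=50):
    """Main phase: execute the landmark's policy as an option, terminating
    at the landmark, at the new task's goal, or at a step cap."""
    policy, total_r, done = landmark_policies[landmark], 0.0, False
    for _ in range(max_steps):
        state, r, done = env.step(state, policy(state))
        total_r += r
        if done or state == landmark:
            break
    return state, total_r, done

# On a new task (fresh goal, same state space), the agent can invoke a
# landmark option alongside primitive actions instead of learning from scratch.
new_task = GridWorld(SIZE, goal=(7, 1))
s, ret, done = run_option(new_task, state=(0, 0), landmark=(6, 2))
print("option ended at", s, "return", ret, "reached new goal:", done)
```

Terminating an option at its landmark (or at the new task's goal, whichever comes first) is what lets experience from earlier tasks pay off as cheap long-range travel on later ones.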
