
Active Imitation Learning


Abstract

Imitation learning, also called learning by watching or programming by demonstration, has emerged as a means of accelerating many reinforcement learning tasks. Previous work has shown the value of imitation in domains where a single mentor demonstrates execution of a known optimal policy for the benefit of a learning agent. We consider the more general scenario of learning from mentors who are themselves agents seeking to maximize their own rewards. We propose a new algorithm based on the concept of transferable utility for ensuring that an observer agent can learn efficiently in the context of a selfish, not necessarily helpful, mentor. We also address the questions of when an imitative agent should request help from a mentor, and when the mentor can be expected to acknowledge a request for help. In analogy with other types of active learning, we call the proposed approach active imitation learning.
