Conference: Association for the Advancement of Artificial Intelligence Symposium

Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study

Abstract

There are many successful methods for transferring information from one agent to another. One approach, taken in this work, is to have one (source) agent demonstrate a policy to a second (target) agent, and then have that second agent improve upon the policy. By allowing the target agent to observe the source agent's demonstrations, rather than relying on other types of direct knowledge transfer like Q-values, rules, or shared representations, we remove the need for the agents to know anything about each other's internal representation or have a shared language. In this work, we introduce a refinement to HAT, an existing transfer learning method, by integrating the target agent's confidence in its representation of the source agent's policy. Results show that a target agent can effectively 1) improve its initial performance relative to learning without transfer (jumpstart) and 2) improve its performance relative to the source agent (total reward). Furthermore, both the jumpstart and total reward are improved with this new refinement, relative to learning without transfer and relative to learning with HAT.
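The abstract does not say how the target agent's confidence is computed or how it enters action selection, so the following is only a minimal sketch under stated assumptions: states are discrete, the target agent models the source policy by counting demonstrated actions per state, confidence is the empirical frequency of the most common demonstrated action, and a fixed threshold decides whether to follow that action or fall back to the agent's own epsilon-greedy Q-values. The names `fit_policy_model`, `predict_with_confidence`, `choose_action`, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def fit_policy_model(demonstrations):
    """Count how often the source agent chose each action in each state.

    `demonstrations` is an iterable of (state, action) pairs.
    A frequency table stands in for the learned policy model (assumption).
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return counts


def predict_with_confidence(model, state):
    """Return (most frequently demonstrated action, its empirical frequency)."""
    seen = model.get(state)
    if not seen:
        # The state never appeared in the demonstrations: no suggestion.
        return None, 0.0
    action, n = seen.most_common(1)[0]
    return action, n / sum(seen.values())


def choose_action(model, q_values, state, actions,
                  threshold=0.8, epsilon=0.1, rng=random):
    """Follow the demonstrated action when confidence is high enough,
    otherwise act epsilon-greedily on the target agent's own Q-values.

    `q_values` maps (state, action) to a float; missing entries count as 0.
    The thresholding rule is an illustrative assumption.
    """
    suggested, confidence = predict_with_confidence(model, state)
    if suggested is not None and confidence >= threshold:
        return suggested
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))
```

In this sketch, states where the demonstrations are consistent reuse the source agent's behavior (which is what would produce a jumpstart), while ambiguous or unseen states are left to the target agent's own learning, which is where it could improve on the source policy.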
