Conference: IASTED International Conference on Intelligent Systems and Control

LEARNING IMITATION STRATEGIES USING COST-BASED POLICY MAPPING AND TASK REWARDS



Abstract

Learning by imitation is a powerful approach for efficient learning and low-overhead programming. An important part of the imitation process is the mapping of observations to an executable control strategy. This mapping is particularly important if the capabilities of the imitating agent and the demonstrating agent differ significantly. This paper presents an approach that addresses this problem by optimizing a cost function. The result is an executable strategy that resembles the observed effects of the demonstrator on the environment as closely as possible. To ensure that the imitating agent replicates the important aspects of the observed task, a learning component is introduced that learns the appropriate cost function from rewards obtained while executing the imitation strategy. The performance of this approach is illustrated in a simulated multi-agent environment.
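The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a hypothetical one-dimensional world in which the demonstrator's observed positions are given as a list, the imitator can move at most one cell per step, and the cost is a weighted sum of deviation from the observation and action effort. The names `map_policy`, `select_weights`, and `reward_fn`, the cost terms, and the brute-force choice among candidate weights (a stand-in for the learning component) are all assumptions made for illustration.

```python
def map_policy(observed, start, actions, weights, n_states):
    """Map an observed state sequence to an executable one by minimizing a cost.

    Assumed cost per step: w_dev * |imitator state - observed state|
                         + w_act * |action|.
    Dynamic programming over reachable states; returns (total cost, path).
    """
    w_dev, w_act = weights
    # best[s] = (lowest cost so far, path) for ending in state s
    best = {start: (0.0, [start])}
    for target in observed:
        nxt = {}
        for s, (cost, path) in best.items():
            for a in actions:
                s2 = s + a
                if not 0 <= s2 < n_states:
                    continue  # action not executable by the imitator here
                c2 = cost + w_dev * abs(s2 - target) + w_act * abs(a)
                if s2 not in nxt or c2 < nxt[s2][0]:
                    nxt[s2] = (c2, path + [s2])
        best = nxt
    return min(best.values(), key=lambda cp: cp[0])


def select_weights(candidates, observed, start, actions, n_states, reward_fn):
    """Pick the cost weights whose resulting strategy earns the highest reward.

    A simple exhaustive stand-in for learning the cost function from rewards.
    """
    scored = []
    for w in candidates:
        _, path = map_policy(observed, start, actions, w, n_states)
        scored.append((reward_fn(path), w))
    return max(scored, key=lambda rw: rw[0])[1]


if __name__ == "__main__":
    observed = [2, 4, 6]      # demonstrator moves two cells per step
    actions = (-1, 0, 1)      # imitator can move only one cell per step
    # Hypothetical task reward: end as close as possible to the final observed state.
    reward = lambda path: -abs(path[-1] - observed[-1])
    print(map_policy(observed, 0, actions, (1.0, 0.0), 10))
    print(select_weights([(1.0, 5.0), (1.0, 0.0)], observed, 0, actions, 10, reward))
```

In this toy setting the imitator cannot reproduce the demonstrator's trajectory exactly, so the cost weights determine which aspect it preserves: a high action weight keeps it stationary, while a low one makes it follow the demonstrator as far as its own capabilities allow. The reward then decides which weighting is kept.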
