Journal: Artificial Intelligence

Multi-robot inverse reinforcement learning under occlusion with estimation of state transitions



Abstract

Inverse reinforcement learning (IRL), analogously to RL, refers to both the problem and the associated methods by which an agent, passively observing another agent's actions over time, seeks to learn the latter's reward function. The learning agent is typically called the learner, while the observed agent is often an expert, as in popular applications such as learning from demonstrations. Some of the assumptions underlying current IRL methods are impractical for many robotic applications. Specifically, they assume that the learner has full observability of the expert as it performs its task; that the learner has full knowledge of the expert's dynamics; and that there is only ever one expert agent in the environment. These assumptions are particularly restrictive in our application scenario, in which a subject robot, after observing a perimeter patrol performed by two other robots from a vantage point, is tasked with penetrating the patrol. In our instance of this problem, the learner can observe at most 10% of the patrol.
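The setting the abstract describes, a learner estimating an expert's reward function from observations that are mostly occluded, can be sketched in miniature. The toy chain environment, the occlusion window, and the feature-expectation-matching heuristic below are all illustrative assumptions of ours, not the paper's actual method:

```python
import numpy as np

# Illustrative IRL-under-occlusion sketch (not the paper's algorithm).
# Reward is assumed linear in one-hot state features; the learner sees
# only the time steps outside an occluded window of the expert's run.

n_states = 5
features = np.eye(n_states)  # one-hot feature vector per state

# Expert trajectory in a 1-D chain: walks right, then stays at the
# rewarding terminal state 4.
expert_traj = [0, 1, 2, 3, 4, 4, 4, 4, 4, 4]

# Occlusion: steps 2..6 are hidden from the learner's vantage point.
observable = [t for t in range(len(expert_traj)) if not (2 <= t <= 6)]

def feature_expectation(traj, steps, gamma=0.9):
    """Discounted empirical feature expectation over observed steps only."""
    mu = np.zeros(n_states)
    for t in steps:
        mu += (gamma ** t) * features[traj[t]]
    return mu / len(steps)

mu_expert = feature_expectation(expert_traj, observable)

# Baseline behaviour for comparison: a trajectory that never leaves state 0.
mu_naive = feature_expectation([0] * len(expert_traj), observable)

# Max-margin-style estimate: a reward direction that separates the expert
# from the baseline in feature-expectation space.
w_hat = mu_expert - mu_naive
w_hat /= np.linalg.norm(w_hat)

# Even from partial observations, the estimated reward ranks the goal
# state above the start state.
print(w_hat @ features[4] > w_hat @ features[0])  # → True
```

The point of the sketch is that the feature expectations are computed only over the observable steps, which is the core complication occlusion introduces; the paper additionally handles unknown expert dynamics and multiple interacting experts, which this toy example does not attempt.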
