IEEE Symposium Series on Computational Intelligence

Apprenticeship Learning for Continuous State Spaces and Actions in a Swarm-Guidance Shepherding Task



Abstract

Apprenticeship learning (AL) is a learning scheme that uses demonstrations collected from human operators. Apprenticeship learning via inverse reinforcement learning (AL via IRL) has been one of the primary candidate approaches for obtaining a near-optimal policy that performs as well as the human's. The algorithm works by attempting to recover and approximate the human reward function from the demonstrations. This approach helps overcome limitations such as sensitivity to variance in the quality of human data and the short-sighted decision horizon that does not consider future states. However, handling continuous action and state spaces remains challenging for AL via IRL algorithms. In this paper, we propose a new AL via IRL approach that is able to work with continuous action and state spaces. Our approach is used to train an artificial intelligence (AI) agent acting as a shepherd of artificial sheep-inspired swarm agents in a complex and dynamic environment. The results show that the performance of our approach matches that of the human operator, and in particular, the agent's movements are smoother and more effective.
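To make the "recover the human reward function from demonstrations" step concrete, here is a minimal sketch of the feature-expectation matching idea underlying classic AL via IRL (in the style of Abbeel and Ng's projection method). All function and variable names are illustrative assumptions; the paper's actual algorithm for continuous state and action spaces differs from this discrete-feature sketch.

```python
import numpy as np

def feature_expectations(trajectories, phi, gamma=0.95):
    """Discounted average of feature vectors phi(s) over demonstration
    trajectories (each trajectory is a sequence of states)."""
    mu = None
    for traj in trajectories:
        acc = sum((gamma ** t) * phi(s) for t, s in enumerate(traj))
        mu = acc if mu is None else mu + acc
    return mu / len(trajectories)

def projection_step(mu_expert, mu_agent, mu_bar_prev):
    """One projection update: move the running estimate mu_bar toward the
    expert's feature expectations along the direction of the current
    agent's feature expectations. Returns candidate reward weights w
    (so R(s) = w . phi(s)), the margin t (stop when t is small), and
    the updated mu_bar."""
    d = mu_agent - mu_bar_prev
    den = d @ d
    if den > 0:
        num = d @ (mu_expert - mu_bar_prev)
        mu_bar = mu_bar_prev + (num / den) * d
    else:
        mu_bar = mu_bar_prev
    w = mu_expert - mu_bar
    t = np.linalg.norm(w)
    return w, t, mu_bar
```

In a full loop, one would alternate between computing reward weights `w` with `projection_step` and training a policy (here, the shepherding agent) against the reward `w . phi(s)` until the margin `t` falls below a threshold.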
