IEEE Symposium Series on Computational Intelligence

Apprenticeship Learning for Continuous State Spaces and Actions in a Swarm-Guidance Shepherding Task



Abstract

Apprenticeship learning (AL) is a learning scheme that uses demonstrations collected from human operators. Apprenticeship learning via inverse reinforcement learning (AL via IRL) has been one of the primary candidate approaches for obtaining a near-optimal policy that matches the quality of the human policy. The algorithm works by attempting to recover and approximate the human reward function from the demonstrations. This approach helps overcome limitations such as sensitivity to variance in the quality of human data and short-sighted decision-making that does not consider future states. However, handling continuous action and state spaces remains challenging for AL via IRL algorithms. In this paper, we propose a new AL via IRL approach that works with continuous action and state spaces. We use it to train an artificial intelligence (AI) agent that acts as a shepherd of artificial sheep-inspired swarm agents in a complex and dynamic environment. The results show that the performance of our approach matches that of the human operator, and in particular the agent's movements are smoother and more effective.
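
The abstract's core mechanism, recovering a reward function whose optimal policy matches the demonstrator's behavior, can be illustrated with the classic projection algorithm for AL via IRL (Abbeel & Ng, 2004). Below is a minimal, hypothetical NumPy sketch: the 2-D point environment, the radial-basis feature map, and the sampled-candidate action selection are illustrative stand-ins for continuous state and action spaces, not the paper's actual implementation. Each iteration sets reward weights from the feature-expectation gap, computes a policy under that reward, and projects the expert's feature expectations onto the line between successive estimates.

```python
# Illustrative sketch of apprenticeship learning via IRL (projection method,
# Abbeel & Ng 2004) on a toy continuous state/action problem. All names and
# parameters here are assumptions for demonstration, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

# Toy continuous environment: state s in R^2, action a in R^2,
# dynamics s' = s + 0.1 * clip(a), implicit goal at the origin.
def step(s, a):
    return s + 0.1 * np.clip(a, -1.0, 1.0)

# Radial-basis features phi(s) over a 3x3 grid of centers in [-1, 1]^2.
centers = np.array([[x, y] for x in (-1, 0, 1) for y in (-1, 0, 1)], float)
def phi(s):
    return np.exp(-np.sum((centers - s) ** 2, axis=1) / 0.5)

GAMMA, HORIZON = 0.95, 40

def feature_expectations(policy, n_rollouts=30):
    """Monte-Carlo estimate of mu(pi) = E[sum_t gamma^t * phi(s_t)]."""
    mu = np.zeros(len(centers))
    for _ in range(n_rollouts):
        s = rng.uniform(-1, 1, size=2)
        for t in range(HORIZON):
            mu += GAMMA ** t * phi(s) / n_rollouts
            s = step(s, policy(s))
    return mu

# Stand-in "expert": a proportional controller toward the goal, playing the
# role of the human operator's demonstrations.
expert = lambda s: -2.0 * s
mu_expert = feature_expectations(expert)

def greedy_policy(w):
    """Continuous-action policy: sample candidate actions and pick the one
    whose next state scores highest under the reward r(s) = w . phi(s)."""
    def pi(s):
        cands = rng.uniform(-1, 1, size=(16, 2))
        scores = [w @ phi(step(s, a)) for a in cands]
        return cands[int(np.argmax(scores))]
    return pi

# Projection algorithm: iteratively shrink the gap to the expert's
# feature expectations; w is the recovered reward-weight estimate.
mu_bar = feature_expectations(greedy_policy(rng.standard_normal(len(centers))))
for i in range(15):
    w = mu_expert - mu_bar                      # reward weights from the gap
    mu_i = feature_expectations(greedy_policy(w))
    d = mu_i - mu_bar                           # project mu_expert onto the
    mu_bar = mu_bar + d * (d @ (mu_expert - mu_bar)) / (d @ d + 1e-12)
    print(f"iter {i}: ||mu_expert - mu_bar|| = "
          f"{np.linalg.norm(mu_expert - mu_bar):.4f}")
```

The printed distance bounds how far the learned policy's value can fall below the expert's, which is what makes feature-expectation matching a proxy for "as good as the human policy" in the sense the abstract describes.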
