RoboCup International Symposium

Learning Complementary Multiagent Behaviors: A Case Study


Abstract

As machine learning is applied to increasingly complex tasks, it is likely that the diverse challenges encountered can only be addressed by combining the strengths of different learning algorithms. We examine this aspect of learning through a case study grounded in the robot soccer context. The task we consider is Keepaway, a popular benchmark for multiagent reinforcement learning from the simulation soccer domain. Whereas previous successful results in Keepaway have limited learning to an isolated, infrequent decision that amounts to a turn-taking behavior (passing), we expand the agents' learning capability to include a much more ubiquitous action (moving without the ball, or getting open), such that at any given time, multiple agents are executing learned behaviors simultaneously. We introduce a policy search method for learning "GETOPEN" to complement the temporal difference learning approach employed for learning "PASS". Empirical results indicate that the learned GETOPEN policy matches the best hand-coded policy for this task, and outperforms the best policy found when PASS is learned. We demonstrate that PASS and GETOPEN can be learned simultaneously to realize tightly-coupled soccer team behavior.
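The abstract pairs two complementary learning methods: temporal-difference learning for the discrete PASS decision and direct policy search for the continuous GETOPEN behavior. The sketch below illustrates that pairing under stated assumptions: a Sarsa-style linear TD update (the approach commonly used for PASS in Keepaway, where 3v2 play is often encoded with 13 state features) alongside a simple (1+1) hill-climbing loop as a stand-in for policy search. All names, parameter values, and the toy evaluate function are assumptions for illustration, not the paper's actual implementation.

```python
import random

# Illustrative sketch only. The abstract contrasts TD learning (PASS)
# with policy search (GETOPEN); everything below is an assumed, minimal
# rendering of that contrast, not the paper's code.

ALPHA, GAMMA = 0.1, 0.95     # assumed step size and discount factor
NUM_FEATURES = 13            # 3v2 Keepaway commonly uses 13 state features
NUM_PASS_ACTIONS = 3         # hold ball, pass to teammate 1, pass to teammate 2

# --- PASS: Sarsa-style temporal-difference learning ---
# One linear weight vector per discrete pass action.
pass_weights = [[0.0] * NUM_FEATURES for _ in range(NUM_PASS_ACTIONS)]

def q_value(features, action):
    """Linear action-value estimate Q(s, a) = w_a . phi(s)."""
    return sum(w * f for w, f in zip(pass_weights[action], features))

def sarsa_update(features, action, reward, next_features, next_action):
    """Move Q(s, a) toward the TD target r + gamma * Q(s', a')."""
    target = reward + GAMMA * q_value(next_features, next_action)
    delta = target - q_value(features, action)
    for i, f in enumerate(features):
        pass_weights[action][i] += ALPHA * delta * f

# --- GETOPEN: direct policy search over policy parameters ---
def evaluate(params):
    """Toy stand-in for the Keepaway performance measure (average episode
    duration under a parameterized GETOPEN policy); a real evaluation
    would run simulated soccer episodes."""
    ideal = [1.0, -0.5, 0.25]
    return -sum((p - t) ** 2 for p, t in zip(params, ideal))

def hill_climb(params, iterations=200, sigma=0.1):
    """(1+1)-style search: keep a Gaussian perturbation if it scores better."""
    best = evaluate(params)
    for _ in range(iterations):
        candidate = [p + random.gauss(0.0, sigma) for p in params]
        score = evaluate(candidate)
        if score > best:
            params, best = candidate, score
    return params

if __name__ == "__main__":
    print(hill_climb([0.0, 0.0, 0.0]))
```

Running both learners concurrently, as the paper does, would interleave sarsa_update calls for the keeper with the ball and policy-search evaluations for the keepers getting open, which is what makes the team behavior tightly coupled.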
