Conference: Intelligent Robots and Systems, 2007 IEEE/RSJ International Conference on
Rapid behavior learning in multi-agent environment based on state value estimation of others

Abstract

Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to multiagent dynamic environments. RoboCup competitions are a typical example, since other agents and their behaviors easily cause the state and action spaces to explode. This paper presents a modular learning method for a multiagent environment by which the learning agent can acquire cooperative behaviors with its teammates and competitive behaviors against its opponents. Two key ideas address the issue. First, a two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the sensor and action spaces. The state space of the top layer consists of the state values from the lower level, and macro actions are used to reduce the size of the physical action space. Second, the extent to which another agent is close to its own goal is estimated by observation and used as a state value in the top-layer state space to realize cooperative and competitive behaviors. The method is applied to a 4 (defense team) on 5 (offense team) game task, and the learning agent successfully acquired teamwork plays (pass and shoot) in a much shorter learning time (30 times quicker than the earlier work).
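The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the learner, the discretization, or any names, so the tabular Q-learning update, the three-level discretization, the assumption that state values lie in [0, 1], and every identifier below (`discretize`, `top_state`, `TopLayerLearner`) are hypothetical choices made for illustration. The sketch only shows how a top-layer state could be assembled from lower-level state values, including estimated values for other agents, and how a learner could then choose among macro actions.

```python
import random
from collections import defaultdict


def discretize(value, bins=3):
    """Map a state value assumed to lie in [0, 1] to one of `bins` levels."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * bins), bins - 1)


def top_state(own_values, other_values, bins=3):
    """Build a top-layer state from lower-level state values.

    own_values: state values from the agent's own lower-level modules.
    other_values: estimated state values of other agents, i.e. how close
    each one appears to be to its own goal (assumed given by observation).
    """
    combined = list(own_values) + list(other_values)
    return tuple(discretize(v, bins) for v in combined)


class TopLayerLearner:
    """Tabular Q-learning over macro actions (an assumed learner)."""

    def __init__(self, macro_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.macro_actions = list(macro_actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, macro_action) -> value
        self.rng = random.Random(seed)

    def select(self, state):
        """Epsilon-greedy choice among macro actions."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.macro_actions)
        return max(self.macro_actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """One-step Q-learning update on the top-layer table."""
        best_next = max(self.q[(next_state, a)] for a in self.macro_actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In this sketch the top-layer table grows with the number of modules and discretization levels rather than with the raw sensor dimensions, which is the kind of reduction the abstract attributes to the hierarchical design.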