
Experience-based Reinforcement Learning to Acquire Effective Behavior in a Multiagent Domain



Abstract

In this paper, we discuss Profit-sharing, an experience-based reinforcement learning approach (similar to a Monte-Carlo based reinforcement learning method) that can be used to learn robust and effective actions within uncertain, dynamic, multi-agent systems. We introduce the cut-loop routine, which discards looping behavior, and demonstrate its effectiveness empirically within a simplified NEO (non-combatant evacuation operation) domain. This domain consists of several agents that ferry groups of evacuees to one of several shelters. We demonstrate that the cut-loop routine makes the Profit-sharing approach adaptive and robust within a dynamic and uncertain domain, without the need for pre-defined knowledge or subgoals. We also compare it empirically with the popular Q-learning approach.
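
To make the two mechanisms named in the abstract concrete, the following is a minimal Python sketch of episode-level Profit-sharing with a cut-loop step. It assumes a geometric credit-decay function and detects loops by state revisits; the names (ProfitSharingAgent, cut_loops, decay, epsilon) are illustrative assumptions for this sketch, not definitions from the paper.

import random
from collections import defaultdict

def cut_loops(episode):
    # Cut-loop sketch: whenever a state reappears in the episode trace,
    # discard the (state, action) pairs between its two occurrences, so
    # the looping detour is never reinforced.
    trace = []
    seen = {}  # state -> index of its occurrence in `trace`
    for state, action in episode:
        if state in seen:
            i = seen[state]
            for s, _ in trace[i:]:
                del seen[s]
            trace = trace[:i]
        seen[state] = len(trace)
        trace.append((state, action))
    return trace

class ProfitSharingAgent:
    # Minimal Profit-sharing learner: episodic (Monte-Carlo style) credit
    # assignment over the loop-free trace, with a geometric credit function.
    def __init__(self, actions, decay=0.5, epsilon=0.1):
        self.weights = defaultdict(float)  # (state, action) -> weight
        self.actions = actions
        self.decay = decay                 # credit-decay ratio (assumed)
        self.epsilon = epsilon             # exploration rate

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.weights[(state, a)])

    def learn(self, episode, reward):
        # Distribute the episodic reward backward over the loop-free trace.
        trace = cut_loops(episode)
        credit = reward
        for state, action in reversed(trace):
            self.weights[(state, action)] += credit
            credit *= self.decay

Because reward is only shared along the loop-free portion of each episode, cycling behavior earns no credit, which is the property the abstract attributes to the cut-loop routine.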
