Adaptive module acquisition in modular reinforcement learning

Abstract

This paper proposes an adaptive module acquisition method for modular reinforcement learning, in which a learning agent starts with a set of fundamental modules and acquires new modules during learning when necessary. This eases the problem that, without a priori knowledge of the task, it is difficult to know in advance which module structure is suitable. The criterion for introducing new modules is derived from a fundamental property of reinforcement learning: after sufficient learning, state values are highly likely to increase along the greedy policy. The proposed method is implemented on Q-learning and applied to a computer-simulated "pursuit problem," in which two learning agents are navigated to catch a randomly moving object. In the simulations, the proposed method performs as well as or better than normal Q-learning and modular Q-learning without the ability to acquire new modules, while using fewer states.
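The abstract does not give the exact form of the criterion, so the following is only a minimal sketch of the idea it describes: measure how often state values rise along greedy transitions, and treat a low rate as a hint that the current modules are insufficient. The function names, the strict "greater than" comparison, and the `threshold` value are all assumptions made for illustration, not the authors' formulation.

```python
from typing import List, Sequence, Tuple

QTable = List[List[float]]  # q[state][action]; a hypothetical tabular layout


def q_update(q: QTable, s: int, a: int, r: float, s_next: int,
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Standard one-step Q-learning update, applied in place."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def state_value(q: QTable, s: int) -> float:
    """Value of a state under the greedy policy: max over actions."""
    return max(q[s])


def value_increase_rate(q: QTable,
                        greedy_transitions: Sequence[Tuple[int, int]]) -> float:
    """Fraction of greedy transitions (s, s_next) with V(s_next) > V(s)."""
    if not greedy_transitions:
        return 0.0
    rising = sum(
        1 for s, s_next in greedy_transitions
        if state_value(q, s_next) > state_value(q, s)
    )
    return rising / len(greedy_transitions)


def needs_new_module(q: QTable,
                     greedy_transitions: Sequence[Tuple[int, int]],
                     threshold: float = 0.8) -> bool:
    """Assumed decision rule: a low increase rate suggests adding a module."""
    return value_increase_rate(q, greedy_transitions) < threshold


if __name__ == "__main__":
    # Three states whose greedy values rise toward the goal: 0.5, 0.7, 1.0.
    q = [[0.0, 0.5], [0.0, 0.7], [0.0, 1.0]]
    print(needs_new_module(q, [(0, 1), (1, 2)]))  # values rise on every step
    print(needs_new_module(q, [(0, 1), (1, 0)]))  # values rise on half the steps
```

How modules are represented and how their outputs are combined is not covered here; the sketch isolates only the monitoring signal that could trigger acquisition.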