
Evolution-based Discovery of Hierarchical Behaviors



Abstract

Procedural representations of control policies have two advantages when facing the scale-up problem in learning tasks. First, they are implicit, with potential for inductive generalization over a very large set of situations. Second, they facilitate modularization. In this paper we compare several randomized algorithms for learning modular procedural representations. The main algorithm, called Adaptive Representation through Learning (ARL), is a genetic programming extension that relies on the discovery of subroutines. ARL is suitable for learning hierarchies of subroutines and for constructing policies for complex tasks. ARL was successfully tested on a typical reinforcement learning problem of controlling an agent in a dynamic and nondeterministic environment, where the discovered subroutines correspond to agent behaviors.
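The abstract describes ARL as a genetic programming extension that discovers subroutines and promotes them into a reusable library, so that later generations can compose policies hierarchically. The toy sketch below illustrates only that general idea: it evolves expression trees over a small function set and promotes a subtree of a fit individual to a named one-argument subroutine. All names (`random_tree`, `discover_subroutine`, the toy fitness target) are illustrative assumptions; ARL's actual salient-block heuristics (e.g. differential fitness and block activation) are not specified in the abstract and are not implemented here.

```python
import random

random.seed(0)

# Primitive function set and terminal set for the expression trees.
FUNCS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(depth=3):
    """Grow a random expression tree, represented as nested tuples."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(FUNCS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x, library):
    """Evaluate a tree; a node naming a library entry is a subroutine
    call, whose single argument is bound to `x` inside the body."""
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    head = tree[0]
    if head in library:  # call a discovered subroutine
        return evaluate(library[head], evaluate(tree[1], x, library), library)
    return FUNCS[head](evaluate(tree[1], x, library),
                       evaluate(tree[2], x, library))

def fitness(tree, library):
    """Toy objective: approximate f(x) = x*x + 1 on sample points."""
    return -sum((evaluate(tree, x, library) - (x * x + 1)) ** 2
                for x in (0.0, 1.0, 2.0))

def discover_subroutine(tree, library):
    """Promote a subtree of a fit individual to a named subroutine,
    extending the representation for later generations (a crude
    stand-in for ARL's block-selection heuristics)."""
    if not isinstance(tree, tuple):
        return None
    name = f"sub{len(library)}"
    library[name] = tree[1] if isinstance(tree[1], tuple) else tree
    return name

# One generation of a crude loop: sample, select the best, extract.
library = {}
population = [random_tree() for _ in range(200)]
best = max(population, key=lambda t: fitness(t, library))
sub = discover_subroutine(best, library)
```

In the full algorithm the library would feed back into the function set used by mutation and crossover, so subroutines can themselves call earlier subroutines, yielding the hierarchies of behaviors the abstract refers to.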


