Conference: International Conference on Inductive Logic Programming

A Direct Policy-Search Algorithm for Relational Reinforcement Learning

Abstract

In the field of relational reinforcement learning, a representational generalisation of reinforcement learning, the first-order representation of environments results in a potentially infinite number of possible states, so learning agents must use some form of abstraction to learn effectively. Instead of forming an abstraction over the state-action space, an alternative technique is to create behaviour directly through policy search. The algorithm presented in this paper, named CERRLA, uses the cross-entropy method to learn behaviour directly in the form of decision lists of relational rules for solving problems in a range of different environments, without the need for expert guidance in the learning process. The behaviour produced by the algorithm is easy to comprehend and is biased towards compactness. The results obtained show that CERRLA is competitive both in the standard testing environment and in Ms. PAC-MAN and CARCASSONNE, two large and complex game environments.
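The general idea of cross-entropy policy search over decision lists can be illustrated with a small sketch. This is not CERRLA itself: the toy task, the four hand-written candidate rules, the fixed rule ordering, the per-rule inclusion probabilities, and every name and parameter value below are assumptions made for illustration only. The sketch samples decision lists from a distribution, scores them, and shifts the distribution towards the best-scoring samples, with a small size penalty standing in for the bias towards compactness.

```python
import random

# Hypothetical candidate rules: (description, condition, action).
# They are listed in a fixed priority order, which is a simplification.
RULES = [
    ("x is even -> low", lambda x: x % 2 == 0, "low"),   # distractor
    ("x >= 10 -> high",  lambda x: x >= 10,    "high"),
    ("x < 10 -> low",    lambda x: x < 10,     "low"),
    ("always -> high",   lambda x: True,       "high"),  # distractor
]
STATES = range(20)  # toy state space (assumption)


def target(x):
    """Correct action for the toy task."""
    return "high" if x >= 10 else "low"


def act(policy, x):
    """Decision list: the first rule whose condition holds picks the action."""
    for i in policy:
        _, cond, action = RULES[i]
        if cond(x):
            return action
    return None  # no rule fired


def score(policy):
    """Accuracy over all states, minus a small penalty per rule used."""
    correct = sum(act(policy, x) == target(x) for x in STATES)
    return correct / len(STATES) - 0.01 * len(policy)


def cross_entropy_search(iterations=20, samples=50, elite_frac=0.2,
                         step=0.6, seed=0):
    rng = random.Random(seed)
    probs = [0.5] * len(RULES)          # inclusion probability per rule
    best_policy, best_score = [], score([])
    n_elite = max(1, int(samples * elite_frac))
    for _ in range(iterations):
        # Sample decision lists by including each rule independently.
        population = []
        for _ in range(samples):
            policy = [i for i, p in enumerate(probs) if rng.random() < p]
            population.append((score(policy), policy))
        # Keep the highest-scoring samples as the elite set.
        population.sort(key=lambda sp: sp[0], reverse=True)
        elite = population[:n_elite]
        if elite[0][0] > best_score:
            best_score, best_policy = elite[0]
        # Move each probability towards the rule's frequency in the elite set.
        for i in range(len(RULES)):
            freq = sum(i in policy for _, policy in elite) / n_elite
            probs[i] = (1 - step) * probs[i] + step * freq
    return probs, best_policy, best_score


if __name__ == "__main__":
    probs, policy, value = cross_entropy_search()
    print([RULES[i][0] for i in policy], value)
```

In this toy setting the two-rule list made of "x >= 10 -> high" followed by "x < 10 -> low" classifies every state correctly; which list a given run returns depends on the random seed.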
