
Anticipatory Learning Classifier Systems and Factored Reinforcement Learning


Abstract

Factored Reinforcement Learning (FRL) is a new technique for solving Factored Markov Decision Problems (FMDPs) when the structure of the problem is not known in advance. Like Anticipatory Learning Classifier Systems (ALCSs), it is a model-based Reinforcement Learning approach that includes generalization mechanisms for structured domains. In general, FRL and ALCSs are explicit, state-anticipatory approaches that learn generalized state transition models and use them to improve system behavior through model-based reinforcement learning techniques. In this contribution, we highlight the conceptual similarities and differences between FRL and ALCSs, focusing on SPITI, an instance of an FRL method, on the one hand, and on the ALCSs MACS and XACS on the other. Though FRL systems seem to benefit from a clearer theoretical grounding, an empirical comparison between SPITI and XACS on two benchmark problems reveals that the latter scales much better than the former when some combinations of state variables do not occur. Based on this finding, we discuss the mechanisms in XACS that result in the better scalability and propose importing these mechanisms into FRL systems.
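To make the idea of a factored state transition model concrete, the following is a minimal sketch and not the method of SPITI, MACS, or XACS. It assumes a hypothetical toy problem with six binary state variables and a single action, in which each next-step variable depends only on a small, known set of parent variables. All names (`PARENTS`, `local_update`, `step`) and the XOR update rule are illustrative assumptions. Note that FRL, as described in the abstract, has to learn this dependency structure, whereas the sketch simply takes it as given.

```python
# Hypothetical toy problem: N_VARS binary state variables, one action.
N_VARS = 6

# Assumed dependency structure: variable 0 depends on itself only,
# variable i > 0 depends on variables i-1 and i.
PARENTS = {i: (i,) if i == 0 else (i - 1, i) for i in range(N_VARS)}


def local_update(parent_values):
    """Illustrative local rule: XOR of the parent values."""
    result = 0
    for value in parent_values:
        result ^= value
    return result


def step(state):
    """Factored transition: each variable is updated from its parents only."""
    return tuple(
        local_update(tuple(state[j] for j in PARENTS[i]))
        for i in range(N_VARS)
    )


def factored_table_size(parents, arity=2):
    """Rows needed if each variable has its own table over its parents."""
    return sum(arity ** len(p) for p in parents.values())


def flat_table_size(n_vars, arity=2):
    """Rows needed for one table over all joint states."""
    return arity ** n_vars


if __name__ == "__main__":
    print(step((1, 0, 0, 0, 0, 0)))
    print(factored_table_size(PARENTS), flat_table_size(N_VARS))
```

In this toy setting the per-variable tables need 22 rows in total, against 64 rows for a flat table over all joint states. The gap grows exponentially with the number of variables, which is the kind of generalization a factored model is meant to exploit.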

Bibliographic details

  • Conference location: Munich (DE)
  • Author affiliations

    Universite Pierre et Marie Curie - Paris6 Institut des Systemes Intelligents et de Robotique (ISIR), CNRS UMR 7222, 4 place Jussieu, F-75005 Paris, France;

    University of Wuerzburg Roentgenring 11 97070 Wuerzburg, Germany;

    Universite Pierre et Marie Curie - Paris6 Institut des Systemes Intelligents et de Robotique (ISIR), CNRS UMR 7222, 4 place Jussieu, F-75005 Paris, France; Thales Security Solutions Services, Simulation 1 rue du General de Gaulle, Osny BP 226 F95523 Cergy Pontoise Cedex, France;

    Thales Security Solutions Services, ThereSIS Research and Innovation Office Route departementale 128 F91767 Palaiseau Cedex, France;

  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Artificial intelligence theory
