Optimality-Based Analysis of XCSF Compaction in Discrete Reinforcement Learning

Abstract

Learning classifier systems (LCSs) are population-based predictive systems that were originally envisioned as agents acting in reinforcement learning (RL) environments. These systems can suffer from population bloat and so are amenable to compaction techniques that try to strike a balance between population size and performance. A well-studied LCS architecture is XCSF, which in the RL setting acts as a Q-function approximator. We apply XCSF to deterministic and stochastic variants of the FrozenLake8x8 environment from OpenAI Gym and compare its performance, in terms of function approximation error and policy accuracy, to the optimal Q-functions and policies produced by solving the environments via dynamic programming. We then introduce a novel compaction algorithm, Greedy Niche Mass Compaction (GNMC), and study its operation on XCSF's trained populations. Results show that, given a suitable parametrisation, GNMC preserves or even slightly improves function approximation error while yielding a significant reduction in population size. Reasonable preservation of policy accuracy also occurs, and we link this metric to the commonly used steps-to-goal metric in maze-like environments, illustrating how the metrics are complementary rather than competitive.
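The evaluation idea in the abstract, comparing a learned Q-function against an optimal one obtained by dynamic programming, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 4x4 layout, the deterministic transitions, the discount factor of 0.95, the reward of 1 on reaching the goal, and the reading of policy accuracy as the fraction of non-terminal states whose greedy action is among the optimal actions. The noisy `q_hat` merely stands in for an approximator's output; it is not XCSF.

```python
import numpy as np

# Hypothetical 4x4 layout in the FrozenLake style: S start, F frozen, H hole, G goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)
GAMMA = 0.95  # assumed discount factor


def step(state, action, size):
    """Deterministic transition: move one cell, staying in place at the border."""
    row, col = divmod(state, size)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), size - 1)
    col = min(max(col + d_col, 0), size - 1)
    return row * size + col


def optimal_q(grid, gamma=GAMMA, tol=1e-10):
    """Q-value iteration; reward 1 on entering G, holes and goal are terminal."""
    size = len(grid)
    cells = "".join(grid)
    q = np.zeros((size * size, len(ACTIONS)))
    while True:
        new_q = np.zeros_like(q)
        for s, cell in enumerate(cells):
            if cell in "HG":
                continue  # terminal states keep Q = 0
            for a in range(len(ACTIONS)):
                nxt = step(s, a, size)
                reward = 1.0 if cells[nxt] == "G" else 0.0
                future = 0.0 if cells[nxt] in "HG" else gamma * q[nxt].max()
                new_q[s, a] = reward + future
        if np.abs(new_q - q).max() < tol:
            return new_q
        q = new_q


def compare(q_hat, q_star, grid):
    """Return (mean absolute error, policy accuracy) over non-terminal states."""
    cells = "".join(grid)
    live = [s for s, cell in enumerate(cells) if cell not in "HG"]
    mae = float(np.abs(q_hat[live] - q_star[live]).mean())
    hits = 0
    for s in live:
        # All actions tied for the optimal value count as correct.
        best = np.flatnonzero(np.isclose(q_star[s], q_star[s].max()))
        hits += int(np.argmax(q_hat[s]) in best)
    return mae, hits / len(live)


q_star = optimal_q(GRID)
rng = np.random.default_rng(0)
q_hat = q_star + rng.normal(scale=0.05, size=q_star.shape)  # stand-in approximator
mae, accuracy = compare(q_hat, q_star, GRID)
print(f"MAE = {mae:.4f}, policy accuracy = {accuracy:.2f}")
```

The two numbers can move independently: a small approximation error may still flip the greedy action in states where several actions have nearly equal values, which is one reason to report both.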

Bibliographic details

Source: International Conference on Parallel Problem Solving from Nature | 2020 | pp. 471-484 | 14 pages
Authors: Jordan T. Bishop; Marcus Gallagher
Original format: PDF
Keywords: Reinforcement learning; Learning classifier system; XCSF; Compaction

Similar literature

1. Discrete and fuzzy dynamical genetic programming in the XCSF learning classifier system [J]. Richard J. Preen, Larry Bull. Soft computing: A fusion of foundations, methodologies and applications. 2014, No. 1
2. Traffic signal optimization through discrete and continuous reinforcement learning with robustness analysis in downtown Tehran [J]. Mohammad Aslani, Stefan Seipel, Mohammad Saadi Mesgari. Advanced engineering informatics. 2018, No. OCTa
3. Q-learning solution for optimal consensus control of discrete-time multiagent systems using reinforcement learning [J]. Mu Chaoxu, Zhao Qian, Gao Zhongke. Journal of the Franklin Institute. 2019, No. 13
4. Reinforcement Learning of a Morphing Airfoil-Policy and Discrete Learning Analysis [C]. A. Lampton, A. Niksch, J. Valasek. AIAA guidance, navigation and control conference and exhibit. 2008
5. Utilizing a Discrete Event Simulation of Material Handling Plans to Calculate Reinforcement Learning Rewards [D]. Preston, Ryan A. 2017
6. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis [O]. Anne G. E. Collins, Michael J. Frank
7. Optimality-Based Analysis of XCSF Compaction in Discrete Reinforcement Learning [O]. Jordan T. Bishop, Marcus Gallagher. 2020
