The measurement of strategy convergence for reinforcement learning in discrete state space

Abstract

The concept of entropy is introduced into reinforcement learning, and definitions of the local strategy entropy and the global strategy entropy are proposed. The global strategy entropy is proved to be a quantitative, problem-independent measure of learning progress, i.e., the degree of convergence of the strategy. To improve learning performance, reinforcement learning with a self-adaptive learning rate based on the strategy entropy is proposed. The experimental results show that learning based on the local strategy entropy performs better than learning with fixed learning rates.
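
The abstract does not give the formulas, so the following is only a minimal sketch of how such quantities might be computed, not the authors' definitions. It assumes that the local strategy entropy is the Shannon entropy of the action distribution at one state, normalized to [0, 1], that the global strategy entropy is the unweighted mean over states, and that the learning rate is a linear function of the entropy. All function names and the `lr_min`/`lr_max` parameters are hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Turn action values into action probabilities (assumed policy form)."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def local_strategy_entropy(action_probs):
    """Normalized Shannon entropy of one state's action distribution.

    Assumption: 1.0 means a uniform (unconverged) strategy,
    0.0 means a deterministic (converged) strategy.
    """
    n = len(action_probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in action_probs if p > 0.0)
    return h / math.log(n)


def global_strategy_entropy(policy):
    """Unweighted mean of the local entropies over all states (assumption)."""
    return sum(local_strategy_entropy(p) for p in policy.values()) / len(policy)


def adaptive_learning_rate(entropy, lr_min=0.05, lr_max=0.5):
    """Hypothetical linear schedule: higher entropy gives a larger step."""
    return lr_min + (lr_max - lr_min) * entropy


if __name__ == "__main__":
    # Two-state toy policy: one undecided state, one converged state.
    policy = {
        "s0": softmax([0.0, 0.0, 0.0, 0.0]),
        "s1": [1.0, 0.0, 0.0, 0.0],
    }
    for state, probs in policy.items():
        h = local_strategy_entropy(probs)
        print(state, round(h, 3), round(adaptive_learning_rate(h), 3))
    print("global", round(global_strategy_entropy(policy), 3))
```

Under these assumptions, a state whose actions are still equally likely keeps a large learning rate, while a state with a settled action choice gets a small one; the global value falls toward zero as more states converge.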

Bibliographic record

Source: IEEE International Conference on Computer Science and Automation Engineering | 2012 | 7 pages
Authors: Gao Yanming; Yin Jie; Wang Bo; Qu Peng; Zhou Ling
Original format: PDF
CLC classification: TP3-53

Similar literature

1. Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces [J]. Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu. Performance Evaluation Review. 2020, No. 1
2. Ubiquitous Distributed Deep Reinforcement Learning at the Edge: Analyzing Byzantine Agents in Discrete Action Spaces [J]. Wenshuai Zhao, Jorge Peña Queralta, Li Qingqing. Procedia Computer Science. 2020, No. 5
3. Quantum-Enhanced Reinforcement Learning for Finite-Episode Games with Discrete State Spaces [J]. Neukart Florian, Von Dollen David, Seidel Christian. Frontiers in Physics. 2017, No. 9
4. The measurement of strategy convergence for reinforcement learning in discrete state space [C]. Gao Yanming, Yin Jie, Wang Bo. CSAE2012; IEEE International Conference on Computer Science and Automation Engineering. 2012
5. On the convergence of model-free policy iteration algorithms for reinforcement learning: Stochastic approximation under discontinuous mean dynamics [D]. Williams, John Kevin. 2000
6. Action-specialized expert ensemble trading system with extended discrete action space using deep reinforcement learning [O]. JoonBum Leem, Ha Young Kim, Baogui Xin. 2020
7. Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces [O]. Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu. 2020
