Convergence of reinforcement learning algorithms and acceleration of learning - art. no. 026706

Potapov A.; Ali MK.


Abstract

The techniques of reinforcement learning have been gaining increasing popularity recently. However, the question of their convergence rate is still open. We consider the problem of choosing the learning steps alpha(n), and their relation with discount gamma and exploration degree epsilon. Appropriate choices of these parameters may drastically influence the convergence rate of the techniques. From analytical examples, we conjecture optimal values of alpha(n) and then use numerical examples to verify our conjectures. [References: 20]
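As a rough illustration of where the three parameters named in the abstract enter a learning rule, here is a minimal sketch of tabular Q-learning on a toy chain problem. It is not the authors' method or experimental setup: the environment, the function name `q_learning`, and the two step-size schedules compared are all assumptions made for this example. The step size alpha(n) is taken to be a function of the visit count n of a state-action pair, gamma is the discount, and epsilon is the probability of a random exploratory action.

```python
import random


def q_learning(alpha, gamma=0.9, epsilon=0.1, episodes=2000, n_states=5, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the last state yields reward 1 and ends the episode.
    `alpha` maps the visit count n of a state-action pair to a step size.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    visits = [[0, 0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # epsilon-greedy action choice (ties go to "right")
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            s_next = max(s - 1, 0) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0

            # learning step alpha(n) for the n-th visit of (s, a)
            visits[s][a] += 1
            step = alpha(visits[s][a])

            # the goal state's values are never updated, so they stay at 0
            target = reward + gamma * max(q[s_next])
            q[s][a] += step * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    exact = 0.9 ** 3  # value of moving right from state 0 when gamma = 0.9
    q_const = q_learning(lambda n: 0.5)      # constant step (assumed schedule)
    q_harm = q_learning(lambda n: 1.0 / n)   # alpha(n) = 1/n (assumed schedule)
    print("exact        :", exact)
    print("alpha = 0.5  :", q_const[0][1])
    print("alpha = 1/n  :", q_harm[0][1])
```

Running the script prints the learned value of moving right from the start state under each schedule next to the exact discounted value gamma**3, so the effect of the step-size choice can be inspected directly. Because this toy environment is deterministic, it says nothing about noisy problems, where the trade-off between the schedules may differ.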


Bibliographic record

Source: Physical review, E. Statistical physics, plasmas, fluids, and related interdisciplinary topics | 2003, no. 2 | 1 page
Authors: Potapov A.; Ali MK.
Original format: PDF
Language: English
Keywords: TD(lambda)

Similar literature

1. Convergence of reinforcement learning algorithms and acceleration of learning - art. no. 026706 [J]. Potapov A., Ali MK. Physical review, E. Statistical physics, plasmas, fluids, and related interdisciplinary topics. 2003, no. 2, Pt. 2
2. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms [J]. Satinder Singh, Tommi Jaakkola, Michael L. Littman. Machine Learning. 2000, no. 3
3. Dynamics of the evolution of learning algorithms by selection - art. no. 041912 [J]. Neirotti JP., Caticha N. Physical review, E. Statistical physics, plasmas, fluids, and related interdisciplinary topics. 2003, no. 4, Pt. 1
4. HAT-DRL: Hotspot-Aware Task Mapping for Lifetime Improvement of Multicore System using Deep Reinforcement Learning [C]. Jinwei Zhang, Sheriff Sadiqbatcha, Yuanqi Gao, ACM/IEEE Workshop on Machine Learning for CAD. 2020
5. On the convergence of model-free policy iteration algorithms for reinforcement learning: Stochastic approximation under discontinuous mean dynamics [D]. Williams, John Kevin. 2000
6. Myocardial infarction evaluation from stopping time decision toward interoperable algorithmic states in reinforcement learning [O]. Jong-Rul Park, Sung Phil Chung, Sung Yeon Hwang, 2020
7. Speedy Q-learning: a computationally efficient reinforcement learning algorithm with a near optimal rate of convergence [O]. Azar M.G., Munos R., Ghavamzadeh M., 2013
