Uncertainty Propagation for Efficient Exploration in Reinforcement Learning

Abstract

Reinforcement learning aims to derive an optimal policy for an often initially unknown environment. In the case of an unknown environment, exploration is used to acquire knowledge about it. In that context the well-known exploration-exploitation dilemma arises: when should one stop exploring and instead exploit the knowledge already gathered? In this paper we propose an uncertainty-based exploration method. We use uncertainty propagation to obtain the Q-function's uncertainty and then use the uncertainty in combination with the Q-values to guide the exploration to promising states that so far have been insufficiently explored. A parameter controls the uncertainty's weight during action selection. We evaluate one variant of the algorithm using full covariance matrices and two variants using an approximation, and demonstrate their functionality on two benchmark problems.
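
The action-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how Q-values and uncertainty are combined, so the additive form `Q + weight * sigma` below is an assumption, and the names `select_action`, `q_std`, and `weight` are hypothetical. The sketch also takes the uncertainty estimates as given and does not implement the uncertainty propagation itself.

```python
import numpy as np


def select_action(q_values, q_std, weight):
    """Return the action index maximizing Q(s, a) + weight * sigma(s, a).

    q_values: estimated Q-values for every action in the current state.
    q_std:    estimated uncertainty (standard deviation) of each Q-value.
    weight:   hypothetical parameter scaling the uncertainty's influence;
              0 gives purely greedy selection, larger values favor
              actions whose value is still uncertain.
    """
    q_values = np.asarray(q_values, dtype=float)
    q_std = np.asarray(q_std, dtype=float)
    # Assumed additive combination; the paper's exact rule may differ.
    return int(np.argmax(q_values + weight * q_std))


if __name__ == "__main__":
    q = [1.0, 0.8, 0.5]        # made-up Q-values for three actions
    sigma = [0.05, 0.40, 0.10]  # made-up uncertainties
    print(select_action(q, sigma, weight=0.0))  # greedy choice
    print(select_action(q, sigma, weight=1.0))  # uncertainty-aware choice
```

With these made-up numbers, a weight of 0 picks the action with the highest Q-value, while a weight of 1 shifts the choice to the second action, whose value is lower but much less certain.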

Bibliographic details

Source
European Conference on Artificial Intelligence | 2010 | 6 pages
Authors
Alexander Hans; Steffen Udluft
Original format: PDF
Chinese Library Classification: TP18-53
Date indexed: 2022-08-20 19:51:53

Similar literature

1. Efficient exploration through active learning for value function approximation in reinforcement learning [J]. Akiyama T, Hachiya H, Sugiyama M. Neural Networks: The Official Journal of the International Neural Network Society. 2010, No. 5
2. Deficits in positive reinforcement learning and uncertainty-driven exploration are associated with distinct aspects of negative symptoms in schizophrenia [J]. Strauss GP, Frank MJ, Waltz JA, et al. Biological Psychiatry. 2011, No. 5
3. PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning [J]. Li Shilei, Li Meng, Su Jiongming, et al. ACM Transactions on Intelligent Systems and Technology. 2021, No. 3
4. Uncertainty Propagation for Efficient Exploration in Reinforcement Learning [C]. Alexander Hans, Steffen Udluft. European Conference on Artificial Intelligence. 2010
5. Sample-Efficient Nonconvex Optimization Algorithms in Machine Learning and Reinforcement Learning [D]. Xu, Pan. 2021
6. Deficits in Positive Reinforcement Learning and Uncertainty-Driven Exploration are Associated with Distinct Aspects of Negative Symptoms in Schizophrenia [O]. Gregory P. Strauss, Michael J. Frank, James A. Waltz, et al.
7. Efficient Uncertainty Propagation for Reinforcement Learning with Limited Data [O]. Alexander Hans, Steffen Udluft. 2010
