Sparse Gradient-Based Direct Policy Search


Abstract

Reinforcement learning is challenging when state and action spaces are continuous. Discretizing the state and action spaces, and adapting that discretization in real time, are critical issues in reinforcement learning problems. In this contribution we consider adaptive discretization and introduce a sparse gradient-based direct policy search method. We address the issue of efficient state/action selection in gradient-based direct policy search by imposing sparsity through an L_1 penalty term. We propose to start learning with a fine discretization of the state space and to induce sparsity via the L_1 norm. We compare the proposed approach to state-of-the-art methods, such as progressive widening Q-learning, which updates the discretization of the states adaptively, and to classic as well as sparse Q-learning with linear function approximation. Our experiments on standard reinforcement learning challenges demonstrate that the proposed approach is efficient.
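To make the role of the L_1 penalty concrete, here is a minimal sketch of one penalized update on a vector of policy weights, with one weight per cell of a fine discretization. The abstract does not say how the penalized objective is optimized, so the proximal-gradient (soft-thresholding) step used here is an assumption rather than the author's procedure. The names `soft_threshold`, `sparse_policy_step`, `grad_return`, `step_size`, and `l1_weight`, and all numeric values, are hypothetical.

```python
def soft_threshold(weights, tau):
    """Proximal operator of tau * ||w||_1: shrink each weight toward zero by tau."""
    shrunk = []
    for w in weights:
        magnitude = max(abs(w) - tau, 0.0)
        shrunk.append(magnitude if w > 0 else -magnitude)
    return shrunk


def sparse_policy_step(theta, grad_return, step_size, l1_weight):
    """One proximal-gradient ascent step on J(theta) - l1_weight * ||theta||_1.

    grad_return is assumed to be an estimate of the gradient of the expected
    return J with respect to theta; how it is estimated is outside this sketch.
    """
    # Plain gradient ascent on the return.
    ascent = [t + step_size * g for t, g in zip(theta, grad_return)]
    # Shrinkage induced by the L_1 penalty; small weights become exactly zero.
    return soft_threshold(ascent, step_size * l1_weight)


# Toy usage with hypothetical values: four weights of a fine discretization.
theta = [0.8, -0.05, 0.02, -1.2]
grad_return = [0.5, 0.1, -0.1, -0.4]
theta_new = sparse_policy_step(theta, grad_return, step_size=0.1, l1_weight=1.0)
active = [i for i, w in enumerate(theta_new) if w != 0.0]
print(theta_new, active)
```

In this toy run the two small weights are driven exactly to zero, so only the first and last cells stay active. That is the sense in which an L_1 penalty can prune an initially fine discretization.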


Bibliographic details

Source: International Conference on Neural Information Processing, 2012, pp. 212-221 (10 pages)
Author: Nataliya Sokolovska
Original format: PDF
Keywords: Direct policy search; Q-learning; model selection


Similar literature


1. Optimized look-ahead tree policies: a bridge between look-ahead tree policies and direct policy search [J]. Tobias Jung, Louis Wehenkel, Damien Ernst. International Journal of Adaptive Control and Signal Processing. 2014, Issue 3a5
2. Multi-objective Model-based Policy Search for Data-efficient Learning with Sparse Rewards [J]. Rituraj Kaushik, Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret. JMLR: Workshop and Conference Proceedings. 2018, Issue 1
3. Directed Policy Search for Decision Making Using Relevance Vector Machines [J]. Ioannis Rexakis, Michail G. Lagoudakis. International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms. 2014, Issue 4
4. Sparse Gradient-Based Direct Policy Search [C]. Nataliya Sokolovska. International Conference on Neural Information Processing. 2012
5. An Adaptive Approach to Optimal Sparse Mobile-Target Search Planning Using Heterogeneous Agents [D]. Zendai Kashino. 2020
6. Hidden Policy Attribute-Based Data Sharing with Direct Revocation and Keyword Search in Cloud Computing [O]. Axin Wu, Dong Zheng, Yinghui Zhang. 2018
7. Gradient-based Reinforcement Planning in Policy-Search Methods [O]. Ivo Kwee, Marcus Hutter, Juergen Schmidhuber. 2001
8. Gradient-Based Adaptive Stochastic Search for Non-Differentiable Optimization [R]. E. Zhou, J. Hu. 2012
