13th European Conference on Machine Learning (ECML 2002), Aug 19-23, 2002, Helsinki, Finland

Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning

Abstract

We present the Q-Cut algorithm, a graph-theoretic approach for automatic detection of sub-goals in a dynamic environment, which is used to accelerate the Q-Learning algorithm. The learning agent creates an on-line map of the process history, and uses an efficient Max-Flow/Min-Cut algorithm to identify bottlenecks. The policies for reaching bottlenecks are learned separately and added to the model in the form of options (macro-actions). We then extend the basic Q-Cut algorithm to the Segmented Q-Cut algorithm, which uses previously identified bottlenecks for state-space partitioning, necessary for finding additional bottlenecks in complex environments. Experiments show significant performance improvements, particularly in the initial learning phase.
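The cut step described in the abstract can be illustrated with a minimal sketch (not the authors' implementation; the two-room graph, node names, and capacities below are invented for illustration): observed transitions become edge capacities, an Edmonds-Karp max-flow pass finds the min cut, and the saturated edges leaving the residual-reachable set mark the bottleneck, i.e. the candidate sub-goal.

```python
from collections import defaultdict, deque

def add_edge(cap, u, v, c):
    """Record a directed edge u -> v with capacity c (e.g. a transition count)."""
    cap[u][v] = cap[u].get(v, 0) + c
    cap[v].setdefault(u, 0)  # ensure a slot for the residual back-edge

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp max flow from s to t; returns (flow value, cut edges)."""
    flow = defaultdict(int)

    def bfs_augmenting_path():
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c in cap[u].items():
                if v not in parent and c - flow[(u, v)] > 0:
                    parent[v] = u
                    if v == t:
                        return parent
                    q.append(v)
        return None  # no augmenting path left

    value = 0
    while True:
        parent = bfs_augmenting_path()
        if parent is None:
            break
        # Walk back from t to s and push the bottleneck amount along the path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] - flow[(u, v)] for u, v in path)
        for u, v in path:
            flow[(u, v)] += aug
            flow[(v, u)] -= aug
        value += aug

    # Min cut: edges leaving the set of nodes still reachable in the residual graph.
    seen, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if v not in seen and c - flow[(u, v)] > 0:
                seen.add(v)
                q.append(v)
    cut = [(u, v) for u in seen for v, c in cap[u].items()
           if v not in seen and c > 0]
    return value, cut

# Hypothetical two-room layout: nodes a1..a3 in one room, b1..b3 in the
# other, joined by a single low-capacity "doorway" edge a3 <-> b1.
cap = defaultdict(dict)
for u, v in [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
             ("b1", "b2"), ("b2", "b3"), ("b1", "b3")]:
    add_edge(cap, u, v, 3)  # frequent intra-room transitions
    add_edge(cap, v, u, 3)
add_edge(cap, "a3", "b1", 1)  # rare doorway transition
add_edge(cap, "b1", "a3", 1)

value, cut = max_flow_min_cut(cap, "a1", "b3")
print(value, cut)  # the doorway edge is the only cut edge: the bottleneck
```

In the full algorithm the capacities would come from the agent's accumulated transition statistics rather than hand-set constants, and the states on the source side of the cut adjacent to a cut edge would be treated as sub-goals for which an option policy is then learned.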
