13th European Conference on Machine Learning (ECML 2002), Aug 19-23, 2002, Helsinki, Finland

Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning

Abstract

We present the Q-Cut algorithm, a graph-theoretic approach for automatic detection of sub-goals in a dynamic environment, which is used to accelerate the Q-Learning algorithm. The learning agent creates an on-line map of the process history, and uses an efficient Max-Flow/Min-Cut algorithm to identify bottlenecks. The policies for reaching bottlenecks are learned separately and added to the model in the form of options (macro-actions). We then extend the basic Q-Cut algorithm to the Segmented Q-Cut algorithm, which uses previously identified bottlenecks for state-space partitioning, necessary for finding additional bottlenecks in complex environments. Experiments show significant performance improvements, particularly in the initial learning phase.
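The cut step described in the abstract can be illustrated with a minimal sketch (not the authors' implementation; the two-room graph, node names, and capacities below are invented for illustration): observed transitions become edge capacities, an Edmonds-Karp max-flow pass finds the min cut, and the saturated edges leaving the residual-reachable set mark the bottleneck, i.e. the candidate sub-goal.

```python
from collections import defaultdict, deque

def add_edge(cap, u, v, c):
    """Record a directed edge u -> v with capacity c (e.g. a transition count)."""
    cap[u][v] = cap[u].get(v, 0) + c
    cap[v].setdefault(u, 0)  # ensure a slot for the residual back-edge

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp max flow from s to t; returns (flow value, cut edges)."""
    flow = defaultdict(int)

    def bfs_augmenting_path():
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c in cap[u].items():
                if v not in parent and c - flow[(u, v)] > 0:
                    parent[v] = u
                    if v == t:
                        return parent
                    q.append(v)
        return None  # no augmenting path left

    value = 0
    while True:
        parent = bfs_augmenting_path()
        if parent is None:
            break
        # Walk back from t to s and push the bottleneck amount along the path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] - flow[(u, v)] for u, v in path)
        for u, v in path:
            flow[(u, v)] += aug
            flow[(v, u)] -= aug
        value += aug

    # Min cut: edges leaving the set of nodes still reachable in the residual graph.
    seen, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if v not in seen and c - flow[(u, v)] > 0:
                seen.add(v)
                q.append(v)
    cut = [(u, v) for u in seen for v, c in cap[u].items()
           if v not in seen and c > 0]
    return value, cut

# Hypothetical two-room layout: nodes a1..a3 in one room, b1..b3 in the
# other, joined by a single low-capacity "doorway" edge a3 <-> b1.
cap = defaultdict(dict)
for u, v in [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
             ("b1", "b2"), ("b2", "b3"), ("b1", "b3")]:
    add_edge(cap, u, v, 3)  # frequent intra-room transitions
    add_edge(cap, v, u, 3)
add_edge(cap, "a3", "b1", 1)  # rare doorway transition
add_edge(cap, "b1", "a3", 1)

value, cut = max_flow_min_cut(cap, "a1", "b3")
print(value, cut)  # the doorway edge is the only cut edge: the bottleneck
```

In the full algorithm the capacities would come from the agent's accumulated transition statistics rather than hand-set constants, and the states on the source side of the cut adjacent to a cut edge would be treated as sub-goals for which an option policy is then learned.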
