Subgoal discovery for hierarchical reinforcement learning using learned policies.

Abstract

Reinforcement learning has proven to be an effective method for creating intelligent agents in a wide range of applications. However, it suffers from the need for a large number of training episodes, a problem that is especially noticeable in large domains. Although the utility of hierarchy is commonly accepted, there has been relatively little research on autonomously discovering or creating useful hierarchies. A system is desirable that can scale reinforcement learning to complex real-world tasks and autonomously discover hierarchical structures within their learning and control systems.

This thesis introduces a method that allows a reinforcement learning agent to autonomously discover and create hierarchy from a learned policy model. A hierarchy of actions helps to create an abstraction which is an encapsulation of a set of actions into a single higher level action that allows an agent to learn while ignoring details that appear at finer levels. The main idea is to find subgoals in a learned policy model by searching for states that exhibit certain structural properties. These subgoals are used to create hierarchies of actions. The hierarchies of actions help the agent to explore more effectively and accelerate learning in other tasks in the same or similar environments where the same subgoals are useful. It is demonstrated that the hierarchical action sequences created with autonomously discovered subgoals can facilitate learning and enable effective knowledge transfer to related tasks.

Bibliographic record

Author: Goel, Sandeep Kumar
Author affiliation: The University of Texas at Arlington
Degree-granting institution: The University of Texas at Arlington
Discipline: Computer Science
Degree: M.S.
Year: 2003
Pages: 51 p.
Total pages: 51
Original format: PDF
Language: English (eng)

Similar literature

1. Autonomic discovery of subgoals in hierarchical reinforcement learning [J]. XIAO Ding, LI Yi-tong, SHI Chuan. The Journal of China Universities of Posts and Telecommunications (English edition), 2014, No. 5.
2. Anchor: The achieved goal to replace the subgoal for hierarchical reinforcement learning [J]. Li Ruijia, Cai Zhiling, Huang Tianyi. Knowledge-Based Systems, 2021, No. Auga5.
3. Reinforcement learning transfer based on subgoal discovery and subtask similarity [J]. Wang Hao, Fan Shunguo, Song Jinhua. IEEE/CAA Journal of Automatica Sinica, 2014, No. 3.
4. Subgoal Discovery for Hierarchical Reinforcement Learning Using Learned Policies [C]. Sandeep Goel, Manfred Huber. International Florida Artificial Intelligence Research Society Conference and International FLAIRS Conference: Recent Advances in Artificial Intelligence, 2003.
5. Learning state and action space hierarchies for reinforcement learning using action-dependent partitioning [D]. Asadi, Mehran. 2006.
6. Towards sentiment aided dialogue policy learning for multi-intent conversations using hierarchical reinforcement learning [O]. Tulika Saha, Sriparna Saha, Pushpak Bhattacharyya. 2020.
7. Subgoal Discovery for Hierarchical Dialogue Policy Learning [O]. Da Tang, Xiujun Li, Jianfeng Gao. 2018.
