International conference on artificial neural networks; ICANN 98

Three Principles of Hierarchical Task Composition in Reinforcement Learning

Abstract

We present three principles of hierarchical task composition within a single agent using reinforcement learning to solve continuous control problems. We consider complex tasks whose goals are defined as conjunctions of subgoals, each learned by a separate task using Q-learning. However, subgoals may depend on each other, requiring particular task composition principles. In the first principle, the Q-function of some task is underlaid with the Q-function of an avoidance task, resulting in a composition in which the latter may veto an action of the former. The second principle uses explicit task activation as a hierarchical relation between two tasks. Subtask activation lasts just one time step, the length of which is adapted to the particular subtask's state-space discretization. In the third principle, two tasks are related such that the hierarchically higher one perturbs the goal state of the lower one in the direction of its own goal. These principles define interaction in a multi-layer architecture, with sequential task composition within each layer, and with each layer maintaining the system in an equilibrium condition. The approach is demonstrated on a task in which a truck navigates backwards to a docking point.
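The first principle (an avoidance task's Q-function underlying the main task and possibly vetoing its actions) can be pictured with a minimal tabular sketch. The snippet below is an illustration under assumptions not stated in the abstract: discrete actions, tabular Q-functions, and the names `q_main`, `q_avoid`, and `veto_threshold` are all hypothetical and do not reflect the paper's actual implementation.

```python
import numpy as np

def compose_with_veto(q_main, q_avoid, state, veto_threshold=0.0):
    """Pick an action for `state` with the avoidance task as an underlay:
    actions whose avoidance Q-value falls below `veto_threshold` are vetoed,
    and the main task chooses greedily among the remaining ones."""
    n_actions = q_main.shape[1]
    permitted = [a for a in range(n_actions)
                 if q_avoid[state, a] >= veto_threshold]
    if not permitted:
        # Every action vetoed: fall back to the safest action under the
        # avoidance task (a hypothetical tie-breaking choice).
        return int(np.argmax(q_avoid[state]))
    # Greedy choice of the main task, restricted to non-vetoed actions.
    return max(permitted, key=lambda a: q_main[state, a])

# Toy usage with random tabular Q-functions (5 states, 3 actions).
rng = np.random.default_rng(0)
q_main = rng.normal(size=(5, 3))
q_avoid = rng.normal(size=(5, 3))
action = compose_with_veto(q_main, q_avoid, state=2)
```

In this reading, the main task never overrides the avoidance task: it only ranks the actions the avoidance task leaves permitted, which matches the abstract's description of the avoidance task holding a veto.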
