Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B

Hidden state and reinforcement learning with instance-based state identification


Abstract

Real robots with real sensors are not omniscient. When a robot's next course of action depends on information that is hidden from its sensors because of problems such as occlusion, restricted range, bounded field of view, and limited attention, we say the robot suffers from the hidden state problem. State identification techniques use history information to uncover hidden state. Previous approaches to encoding history include finite state machines, recurrent neural networks, and genetic programming with indexed memory. A chief disadvantage of all these techniques is their long training time. This paper presents instance-based state identification, a new approach to reinforcement learning with state identification that learns in far fewer training steps. Learning with history and learning in continuous spaces share one property: in both, the learner begins without knowing the granularity of the state space. Building on this observation, the approach applies instance-based (or "memory-based") learning to history sequences: instead of recording instances in a continuous geometrical space, we record instances in action-percept-reward sequence space. The first implementation of this approach, called Nearest Sequence Memory, learns in an order of magnitude fewer steps than several previous approaches.
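The idea of recording instances in action-percept-reward sequence space can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration, not the paper's implementation: the `Instance` record, the `match_length` similarity (the number of consecutive matching steps looking backward), and the `nearest_sequences` helper are hypothetical names, and the tie-breaking rule (prefer more recent steps) is an arbitrary choice.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One recorded step: the action taken, the percept seen, the reward received."""
    action: str
    percept: str
    reward: float


def match_length(history, i, j):
    """Count how many consecutive steps, walking backward from i and j, are identical.

    Hypothetical similarity measure: a longer shared suffix means the two
    points in the history are treated as closer neighbors.
    """
    n = 0
    while i - n >= 0 and j - n >= 0:
        a, b = history[i - n], history[j - n]
        if (a.action, a.percept, a.reward) != (b.action, b.percept, b.reward):
            break
        n += 1
    return n


def nearest_sequences(history, k):
    """Return indices of the k earlier steps whose history best matches the latest step.

    Ties are broken in favor of more recent steps (an assumption).
    """
    t = len(history) - 1
    scored = [(match_length(history, t, j), j) for j in range(t)]
    scored.sort(key=lambda s: (-s[0], -s[1]))
    return [j for _, j in scored[:k]]
```

In a sketch like this, two steps with the same current percept can still be told apart when their preceding histories differ, which is how history can stand in for state that the sensors do not reveal. A learner could then, for example, pool value estimates over the returned neighbors, although how Nearest Sequence Memory does so is not stated in the abstract.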
