Artificial life and robotics

Reduction of state space in reinforcement learning by sensor selection



Abstract

Much research has been conducted on applying reinforcement learning to robots, and learning time is a major concern. In reinforcement learning, sensor information is projected onto a state space; a robot learns the correspondence between each state and action and determines the best one. As the state space expands with the number of sensors, the number of correspondences the robot must learn grows, so finding the best correspondence becomes time consuming. In this study, we focus on the importance of individual sensors for a particular task. The sensors relevant to a task differ from task to task, so a robot does not need all of its installed sensors to perform a given task; the state space should consist only of the sensors essential to that task. With a state space built from only the important sensors, a robot can learn correspondences faster than with one built from all installed sensors. We therefore propose a relatively fast learning system in which a robot autonomously selects the sensors essential to a task and constructs a state space from only those sensors. We define a measure of a sensor's importance for a task: the coefficient of correlation between the sensor's value and the reward received in reinforcement learning. The robot determines sensor importance from this correlation and reduces the state space accordingly, allowing it to learn correspondences efficiently. We confirm the effectiveness of the proposed system through a simulation.
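The selection criterion described above, correlating each sensor's readings with the reward, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function names, the toy data, and the selection threshold are all assumptions introduced here.

```python
# Sketch of correlation-based sensor selection: rank sensors by the
# absolute Pearson correlation between their readings and the reward,
# then keep only sensors above a (hypothetical) importance threshold.
import numpy as np

def sensor_importance(sensor_log, reward_log):
    """Absolute correlation between each sensor's readings and the reward.

    sensor_log: array of shape (T, n_sensors), one row per time step.
    reward_log: array of shape (T,), reward received at each step.
    """
    n_sensors = sensor_log.shape[1]
    importance = np.empty(n_sensors)
    for i in range(n_sensors):
        r = np.corrcoef(sensor_log[:, i], reward_log)[0, 1]
        importance[i] = abs(r)
    return importance

def select_sensors(importance, threshold=0.3):
    """Indices of sensors whose importance exceeds the threshold."""
    return [i for i, v in enumerate(importance) if v > threshold]

# Toy data: sensor 0 tracks the reward closely, sensor 1 is pure noise.
rng = np.random.default_rng(0)
reward = rng.random(200)
sensors = np.column_stack([
    reward + 0.05 * rng.standard_normal(200),  # informative sensor
    rng.random(200),                           # irrelevant sensor
])

imp = sensor_importance(sensors, reward)
kept = select_sensors(imp)
```

A state space built from `kept` alone has far fewer states than one built from every installed sensor, which is the source of the speed-up the abstract claims: with n binary sensors the tabular state space has 2^n entries, so dropping even one irrelevant sensor halves it.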
