Conference: American Association for Artificial Intelligence Symposium

Automatic Development from Pixel-level Representation to Action-level Representation in Robot Navigation

Abstract

Many important real-world robotic tasks have high diameter; that is, their solution requires a large number of primitive actions by the robot. For example, they may require navigating to distant locations using primitive motor control commands. In addition, modern robots are endowed with rich, high-dimensional sensory systems that provide measurements of a continuous environment. Reinforcement learning (RL) has shown promise as a method for automatic learning of robot behavior, but current methods work best on low-diameter, low-dimensional tasks. Because of this limitation, the success of RL on real-world tasks still depends on human analysis of the robot, environment, and task to provide a useful sensorimotor representation to the learning agent. A new method, Self-Organizing Distinctive-state Abstraction (SODA) (Provost, Kuipers, & Miikkulainen 2006; Provost 2007), solves this problem by bootstrapping the robot's representation from the pixel level of raw sensor input and motor control signals to a higher action level consisting of distinctive states and extended actions that move the robot between these states. These new states and actions move the robot through its environment in large steps, allowing it to learn to navigate much more easily and quickly than it would using its primitive actions and sensations. SODA requires no hand-coded features or other prior knowledge of the robot's sensorimotor system or environment, and it learns an abstraction that is suitable for supporting multiple tasks in an environment. Given a robot with high-dimensional, continuous sensations, continuous actions, and one or more reinforcement signals for high-diameter tasks, the agent's learning process consists of the following steps.
