Conference: IEEE International Conference on Robotics and Automation

Learning modular neural network policies for multi-task and multi-robot transfer


Abstract

Reinforcement learning (RL) can automate a wide variety of robotic skills, but learning each new skill requires considerable real-world data collection and manual representation engineering to design policy classes or features. Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations. Transfer learning can mitigate this problem by enabling us to transfer information from one skill to another and even from one robot to another. We show that neural network policies can be decomposed into “task-specific” and “robot-specific” modules, where the task-specific modules are shared across robots, and the robot-specific modules are shared across all tasks on that robot. This allows for sharing task information, such as perception, between robots and sharing robot information, such as dynamics and kinematics, between tasks. We exploit this decomposition to train mix-and-match modules that can solve new robot-task combinations that were not seen during training. Using a novel approach to train modular neural networks, we demonstrate the effectiveness of our transfer method for enabling zero-shot generalization with a variety of robots and tasks in simulation for both visual and non-visual tasks.
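The decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the task names ("reach", "push"), robot names ("arm3", "arm4"), layer sizes, and the choice to feed the task module's output into the robot module are all assumptions made for illustration, and the modules are randomly initialised rather than trained. The sketch only shows how one module per task and one module per robot can be recombined into a policy for a robot-task pair that was held out.

```python
import math
import random
from typing import Callable, Dict, List

Vector = List[float]
Module = Callable[[Vector], Vector]

LATENT_DIM = 4  # hypothetical size of the interface between modules


def make_layer(in_dim: int, out_dim: int, rng: random.Random) -> Module:
    """Return a random single tanh layer, standing in for a trained module."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def layer(x: Vector) -> Vector:
        assert len(x) == in_dim, "input size does not match this module"
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in weights]

    return layer


rng = random.Random(0)

# One module per task, shared by every robot (hypothetical tasks and sizes).
task_modules: Dict[str, Module] = {
    "reach": make_layer(3, LATENT_DIM, rng),  # 3-D task observation
    "push": make_layer(6, LATENT_DIM, rng),   # 6-D task observation
}

# One module per robot, shared by every task (hypothetical robots and sizes).
robot_modules: Dict[str, Module] = {
    "arm3": make_layer(LATENT_DIM + 3, 3, rng),  # 3 joints
    "arm4": make_layer(LATENT_DIM + 4, 4, rng),  # 4 joints
}


def compose_policy(task: str, robot: str) -> Callable[[Vector, Vector], Vector]:
    """Mix and match one task module with one robot module."""
    task_net = task_modules[task]
    robot_net = robot_modules[robot]

    def policy(task_obs: Vector, robot_obs: Vector) -> Vector:
        latent = task_net(task_obs)            # task-specific processing
        return robot_net(latent + robot_obs)   # robot-specific action output

    return policy


# Pairs that would be used for training, and one pair held out of training.
train_pairs = [("reach", "arm3"), ("reach", "arm4"), ("push", "arm3")]
held_out = ("push", "arm4")

# Composing the held-out pair needs no new module: both parts already exist.
policy = compose_policy(*held_out)
action = policy([0.0] * 6, [0.1] * 4)
```

In this sketch every task and every robot appears in at least one training pair, so the held-out combination can be assembled from modules that would already have been trained elsewhere.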
