Conference: IEEE International Conference on Robotics and Automation

Learning modular neural network policies for multi-task and multi-robot transfer

Abstract

Reinforcement learning (RL) can automate a wide variety of robotic skills, but learning each new skill requires considerable real-world data collection and manual representation engineering to design policy classes or features. Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations. Transfer learning can mitigate this problem by enabling us to transfer information from one skill to another and even from one robot to another. We show that neural network policies can be decomposed into “task-specific” and “robot-specific” modules, where the task-specific modules are shared across robots, and the robot-specific modules are shared across all tasks on that robot. This allows for sharing task information, such as perception, between robots and sharing robot information, such as dynamics and kinematics, between tasks. We exploit this decomposition to train mix-and-match modules that can solve new robot-task combinations that were not seen during training. Using a novel approach to train modular neural networks, we demonstrate the effectiveness of our transfer method for enabling zero-shot generalization with a variety of robots and tasks in simulation for both visual and non-visual tasks.
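
The decomposition described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the robot and task names, the observation and action sizes, the single-layer modules, the `FEATURE_DIM` interface, and the choice to feed the task module's output into the robot module. The weights are random and untrained; the sketch only shows how one task module per task and one robot module per robot can be mixed and matched to form a policy for a robot-task pair that was held out of training.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return (weights, bias) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense_tanh(layer, x):
    """Apply a dense layer followed by tanh to the input vector x."""
    weights, bias = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


rng = random.Random(0)
FEATURE_DIM = 4  # hypothetical width of the interface between the two modules

# Hypothetical robots and tasks with made-up observation and action sizes.
TASK_OBS_DIM = {"reach": 3, "push": 6}
ROBOT_OBS_DIM = {"arm3": 6, "arm4": 8}
ROBOT_ACT_DIM = {"arm3": 3, "arm4": 4}

# One module per task (shared across robots) and one per robot (shared across tasks).
task_modules = {t: make_linear(d, FEATURE_DIM, rng) for t, d in TASK_OBS_DIM.items()}
robot_modules = {
    r: make_linear(FEATURE_DIM + ROBOT_OBS_DIM[r], ROBOT_ACT_DIM[r], rng)
    for r in ROBOT_OBS_DIM
}


def compose_policy(robot, task):
    """Build a policy for one robot-task pair from the shared modules."""
    def policy(task_obs, robot_obs):
        features = dense_tanh(task_modules[task], task_obs)
        return dense_tanh(robot_modules[robot], features + list(robot_obs))
    return policy


# Hold out one combination; training (not shown) would use only the other pairs.
all_pairs = [(r, t) for r in ROBOT_OBS_DIM for t in TASK_OBS_DIM]
held_out = ("arm4", "push")
training_pairs = [p for p in all_pairs if p != held_out]

# Zero-shot composition: both modules appear in training pairs, but never together.
zero_shot_policy = compose_policy(*held_out)
action = zero_shot_policy([0.0] * TASK_OBS_DIM["push"], [0.0] * ROBOT_OBS_DIM["arm4"])
```

In this sketch each module of the held-out pair still appears in at least one training pair (`"arm4"` with `"reach"`, `"push"` with `"arm3"`), which is what makes recombining them at test time plausible.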