Journal: Engineering Applications of Artificial Intelligence

Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum


Abstract

End-to-end reinforcement learning is a promising approach for enabling robots to acquire complicated skills. However, it requires a large number of samples to succeed, and collecting enough samples is often difficult. To accelerate learning in robotics, knowledge gathered from robotics engineering and from previously learned tasks must be fully exploited. Specifically, we propose a sample-efficient curriculum for quadrupedal robot control in which the walking and turning tasks are divided into two hierarchical layers, and the robot learns them incrementally from the lower layer to the upper layer. To develop such a curriculum, two core components are designed. The first is a fractal design of neural networks in reservoir computing, which allocates the tasks to be learned to respective modules in the fractal networks. This mitigates the problem of catastrophic forgetting in neural networks and provides the capability of continual learning. The second is a hierarchical task decomposition based on robotics knowledge for controlling legged robots. Owing to the combination of these two components, the proposed curriculum enables a robot to tune the lower layer even when the upper layer is optimized. With the proposed design, we confirm that a quadrupedal robot in a dynamical simulator succeeds in learning skills hierarchically according to the given curriculum, starting from moving its legs and finally walking and turning, whereas the conventional curricula considered fail to achieve such results.
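
The idea of allocating each task to its own module, so that learning a later task does not overwrite an earlier one, can be illustrated with a minimal sketch. This is not the paper's fractal reservoir network or its reinforcement-learning procedure: it assumes a single small fixed random reservoir, one linear readout per task, and a supervised least-mean-squares update as a stand-in for the actual learning rule. The names `Reservoir`, `ModularReadout`, and `make_samples`, the task labels `"move_legs"` and `"walk"`, and the sine and cosine targets are all hypothetical.

```python
import math
import random


class Reservoir:
    """Fixed random recurrent network; its weights are never trained."""

    def __init__(self, n_in, n_units, seed=0):
        rng = random.Random(seed)
        self.w_in = [[rng.uniform(-1, 1) for _ in range(n_in)]
                     for _ in range(n_units)]
        # Small recurrent weights keep the state bounded and fading.
        self.w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(n_units)]
                      for _ in range(n_units)]
        self.state = [0.0] * n_units

    def step(self, u):
        """Advance the reservoir by one time step and return its state."""
        self.state = [
            math.tanh(sum(w * x for w, x in zip(w_in_row, u))
                      + sum(w * s for w, s in zip(w_rec_row, self.state)))
            for w_in_row, w_rec_row in zip(self.w_in, self.w_rec)
        ]
        return list(self.state)


class ModularReadout:
    """One linear readout per task; training a task touches only its module."""

    def __init__(self, n_units):
        self.n_units = n_units
        self.modules = {}

    def predict(self, task, features):
        return sum(w * f for w, f in zip(self.modules[task], features))

    def train(self, task, samples, epochs=200, lr=0.05):
        """Fit the module for `task` with least-mean-squares updates."""
        weights = self.modules.setdefault(task, [0.0] * self.n_units)
        for _ in range(epochs):
            for features, target in samples:
                error = target - sum(w * f for w, f in zip(weights, features))
                for i, f in enumerate(features):
                    weights[i] += lr * error * f


def mean_squared_error(readout, task, samples):
    return sum((t - readout.predict(task, f)) ** 2
               for f, t in samples) / len(samples)


def make_samples(reservoir, target_fn, n_steps=50):
    """Drive the reservoir with a sine input and pair states with targets."""
    reservoir.state = [0.0] * len(reservoir.state)
    samples = []
    for k in range(n_steps):
        u = [math.sin(0.3 * k)]
        samples.append((reservoir.step(u), target_fn(k)))
    return samples


reservoir = Reservoir(n_in=1, n_units=8)
readout = ModularReadout(n_units=8)

# Curriculum order: the lower-layer task first, then the upper-layer task.
lower = make_samples(reservoir, lambda k: math.sin(0.3 * k))
upper = make_samples(reservoir, lambda k: math.cos(0.3 * k))

readout.train("move_legs", lower)
lower_error_before = mean_squared_error(readout, "move_legs", lower)

readout.train("walk", upper)
lower_error_after = mean_squared_error(readout, "move_legs", lower)

# The lower-layer module is untouched by upper-layer training.
print(lower_error_before == lower_error_after)
```

Because the reservoir weights are fixed and each task writes only to its own readout, training the second task cannot change the first task's error. A single shared readout trained on both tasks in sequence would not have this property, which is the forgetting problem the modular allocation is meant to avoid.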
