Published in: International Conference on Robotics and Automation

Reinforcement Learning Meets Hybrid Zero Dynamics: A Case Study for RABBIT

Abstract

The design of feedback controllers for bipedal robots is challenging due to the hybrid nature of their dynamics and the complexity imposed by high-dimensional bipedal models. In this paper, we present a novel approach for the design of feedback controllers using Reinforcement Learning (RL) and Hybrid Zero Dynamics (HZD). Existing RL approaches for bipedal walking are inefficient because they do not consider the underlying physics, often require substantial training, and may produce controllers that are not applicable to real robots. HZD is a powerful tool for bipedal control with local stability guarantees of the walking limit cycles. We propose a nontraditional RL structure that embeds the HZD framework into the policy learning. More specifically, we propose to use RL to find a control policy that maps from the robot's reduced-order states to a set of parameters that define the desired trajectories for the robot's joints through the virtual constraints. These trajectories are then tracked using an adaptive PD controller. The method results in a stable and robust control policy that is able to track a variable speed within a continuous interval. Robustness of the policy is evaluated by applying external forces to the torso of the robot. The proposed RL framework is implemented and demonstrated in OpenAI Gym with the MuJoCo physics engine, based on the well-known RABBIT robot model.
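The pipeline outlined in the abstract (policy output, then trajectory parameters, then joint tracking) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not say how the trajectories are parameterized, so Bézier polynomials in a phase variable `s` (a common choice in the HZD literature) are assumed; `policy` is a hypothetical linear stand-in for the trained network; the contents of the reduced-order state are invented; and the PD gains are fixed, whereas the abstract describes an adaptive PD controller.

```python
from math import comb


def bezier(coeffs, s):
    """Evaluate one Bezier polynomial with the given coefficients at phase s in [0, 1]."""
    m = len(coeffs) - 1
    return sum(comb(m, k) * c * s**k * (1 - s) ** (m - k)
               for k, c in enumerate(coeffs))


def bezier_ds(coeffs, s):
    """Derivative of the Bezier polynomial with respect to the phase variable s."""
    m = len(coeffs) - 1
    diffs = [m * (c1 - c0) for c0, c1 in zip(coeffs, coeffs[1:])]
    return bezier(diffs, s)


def policy(reduced_state, weights, bias, n_joints, degree):
    """Hypothetical stand-in for the learned policy (a plain linear map).

    Maps a reduced-order state to one list of Bezier coefficients per joint.
    """
    flat = [sum(w * x for w, x in zip(row, reduced_state)) + b
            for row, b in zip(weights, bias)]
    n = degree + 1
    return [flat[j * n:(j + 1) * n] for j in range(n_joints)]


def control_step(reduced_state, q, dq, s, ds_dt, weights, bias,
                 n_joints=4, degree=3, kp=100.0, kd=5.0):
    """One control step: policy -> desired joint trajectories -> PD torques.

    kp and kd are fixed illustrative gains; the abstract's controller is adaptive.
    """
    params = policy(reduced_state, weights, bias, n_joints, degree)
    torques = []
    for j in range(n_joints):
        q_des = bezier(params[j], s)               # desired joint position
        dq_des = bezier_ds(params[j], s) * ds_dt   # chain rule: d/dt = d/ds * ds/dt
        torques.append(kp * (q_des - q[j]) + kd * (dq_des - dq[j]))
    return torques


# Usage with made-up numbers: 3 reduced-order states, 4 joints, cubic curves.
n_out = 4 * (3 + 1)
weights = [[0.01 * (i + 1), -0.02, 0.03] for i in range(n_out)]
bias = [0.1 * i for i in range(n_out)]
tau = control_step([0.8, 0.75, 0.1], [0.0] * 4, [0.0] * 4,
                   s=0.4, ds_dt=2.0, weights=weights, bias=bias)
print(tau)
```

In this arrangement the learned component only selects trajectory parameters, while torque generation is left to the low-level tracking loop, which is the separation the abstract describes.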
