IFAC PapersOnLine

Reinforcement Learning of Potential Fields to achieve Limit-Cycle Walking



Abstract

Reinforcement learning is a powerful tool for deriving controllers for systems where no models are available. Policy search algorithms in particular are suitable for complex systems, keeping learning time manageable and accommodating continuous state and action spaces. However, these algorithms demand more insight into the system in order to choose a suitable controller parameterization. This paper investigates a type of policy parameterization for impedance control that allows energy input to be implicitly bounded: potential fields. In this work, a methodology is presented for generating a potential field-constrained impedance controller via approximation of example trajectories, and subsequently improving the control policy using reinforcement learning. The potential field-constrained approximation is used as a policy parameterization for policy search reinforcement learning and is compared to its unconstrained counterpart. Simulations on a simple biped walking model show that the learned controllers are able to overcome the potential field of gravity, generating a stable limit-cycle gait on flat ground for both parameterizations. The potential field-constrained controller provides safety through a known energy bound while performing as well as the unconstrained policy.
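The core idea of the abstract, an impedance controller whose stiffness torque is the gradient of a learned potential field, so that the energy the controller can inject is bounded by the range of the field, can be sketched as follows. This is a minimal illustration assuming a radial-basis-function parameterization of the potential; the paper's actual basis functions and learning algorithm are not specified here, and all names are hypothetical.

```python
import numpy as np

class PotentialFieldController:
    """Sketch of a potential field-constrained impedance controller:
    tau = -dU/dq - d * q_dot, where U(q) is a learned potential.
    Because the stiffness torque is conservative and damping only
    dissipates, the controller can add at most max(U) - min(U) of
    mechanical energy along any trajectory."""

    def __init__(self, centers, widths, weights, damping):
        self.c = np.asarray(centers, dtype=float)  # RBF centers over joint angle q
        self.s = np.asarray(widths, dtype=float)   # RBF widths
        self.w = np.asarray(weights, dtype=float)  # learnable policy parameters
        self.d = float(damping)                    # viscous damping gain

    def potential(self, q):
        # U(q) = sum_i w_i * exp(-(q - c_i)^2 / (2 s_i^2))
        return float(np.sum(self.w * np.exp(-(q - self.c) ** 2 / (2 * self.s ** 2))))

    def torque(self, q, qdot):
        # Conservative term -dU/dq plus dissipative damping term.
        dU = np.sum(self.w * (-(q - self.c) / self.s ** 2)
                    * np.exp(-(q - self.c) ** 2 / (2 * self.s ** 2)))
        return float(-dU - self.d * qdot)

    def energy_bound(self, q_grid):
        # Implicit energy bound: the most energy the policy can inject,
        # evaluated numerically over a grid of joint angles.
        u = np.array([self.potential(q) for q in q_grid])
        return float(u.max() - u.min())
```

In a policy search setting, the weights `w` would be the parameters being optimized; the energy bound holds for any weight values, which is what distinguishes this parameterization from an unconstrained torque policy.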
