Conference: Chinese Automation Congress

Heuristic Gait Learning of Quadruped Robot Based on Deep Deterministic Policy Gradient Algorithm

Abstract

Gait control of quadruped robots has long been an active topic in robotics research. Traditional control methods have notable limitations, such as low intelligence and poor autonomy. With the development of artificial intelligence, applying reinforcement learning so that a quadruped robot learns its control strategy autonomously offers a promising solution. The deep deterministic policy gradient (DDPG) algorithm performs well in continuous control tasks, but like other reinforcement learning algorithms that rely on a learned value function, it tends to overestimate values under function approximation, which can lead to a poor policy. To address this problem, this paper proposes a heuristic gait learning method for quadruped robots based on DDPG. Inspired by the Double Q-learning algorithm, the method uses two independent critics and selects the smaller of their two value estimates when updating the parameters. Experiments on the OpenAI Gym platform show that the proposed improved DDPG algorithm performs better than the original DDPG.
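The abstract does not give the exact update rule, so the following is only a minimal sketch of the idea of taking the smaller of two critic estimates. It assumes the usual one-step bootstrapped target, y = r + gamma * min(Q1, Q2), with no bootstrapping on terminal transitions. The function name `min_critic_targets`, its parameters, and the discount value are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def min_critic_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q1_next: Sequence[float],
    q2_next: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not from the paper
) -> List[float]:
    """Compute bootstrapped targets from the smaller of two critic estimates.

    q1_next and q2_next are the two critics' value estimates for the next
    state-action pair of each transition (hypothetical inputs).
    """
    targets = []
    for r, done, q1, q2 in zip(rewards, dones, q1_next, q2_next):
        # Taking the minimum gives a pessimistic estimate, which is
        # intended to counter overestimation by a single critic.
        q_min = min(q1, q2)
        # Terminal transitions contribute only the immediate reward.
        bootstrap = 0.0 if done else gamma * q_min
        targets.append(r + bootstrap)
    return targets
```

In a full training loop, both critics would typically be regressed toward these shared targets. Whether the paper does exactly this, or applies the minimum elsewhere in the update, is not stated in the abstract.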
