Journal: Advanced Robotics: The International Journal of the Robotics Society of Japan

Pneumatic artificial muscle-driven robot control using local update reinforcement learning

Abstract

This study proposes Local Update Dynamic Policy Programming (LUDPP), a new value-function-based reinforcement learning (RL) algorithm. It exploits the smoothness of policy updates based on the Kullback-Leibler divergence to update its value function locally, which considerably reduces computational complexity. We first compared the learning performance of LUDPP with that of other algorithms that lack smooth policy updates on simulated pendulum swing-up and n-DOF manipulator reaching tasks. Only LUDPP learned good control policies efficiently and stably in high-dimensional systems with a limited number of training samples. In a real-world application, we applied LUDPP to control robots driven by pneumatic artificial muscles (PAMs) without knowledge of a model, which is challenging for traditional methods because of the strong nonlinearities in PAM air-pressure dynamics and mechanical structure. LUDPP successfully achieved control of one finger of the Shadow Dexterous Hand, a PAM-driven humanoid robot hand, with far fewer computational resources than conventional value-function-based RL algorithms.
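The abstract does not give the update rule, so the following is only an illustrative sketch. It assumes, from the algorithm's name, that LUDPP builds on tabular Dynamic Policy Programming (DPP), in which action preferences are updated with a Boltzmann-weighted operator and the policy is a softmax of those preferences. It performs full sweeps over all states and does not implement the paper's local update. The two-state toy MDP and the names `boltzmann_softmax`, `dpp_sweep`, `prefs`, `eta`, and `gamma` are hypothetical choices made for this example.

```python
import numpy as np

def boltzmann_softmax(prefs, eta):
    """Return the Boltzmann-weighted mean of prefs[s, :] for each state,
    together with the softmax policy weights (assumed DPP operator)."""
    z = eta * prefs
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    w = np.exp(z)
    w /= w.sum(axis=1, keepdims=True)
    return (w * prefs).sum(axis=1), w

def dpp_sweep(prefs, P, R, gamma, eta):
    """One full-sweep DPP-style update of the preferences prefs[s, a].
    P[s, a, s2] holds transition probabilities, R[s, a] holds rewards."""
    m, _ = boltzmann_softmax(prefs, eta)
    return prefs - m[:, None] + R + gamma * (P @ m)

# Hypothetical toy MDP: action 0 stays, action 1 switches state;
# reward 1 whenever the next state is state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[1, 0, 1] = 1.0  # action 0: stay in place
P[0, 1, 1] = P[1, 1, 0] = 1.0  # action 1: move to the other state
R = np.array([[0.0, 1.0],
              [1.0, 0.0]])

eta, gamma = 1.0, 0.9
prefs = np.zeros((2, 2))
for _ in range(50):
    prefs = dpp_sweep(prefs, P, R, gamma, eta)

_, policy = boltzmann_softmax(prefs, eta)
print(policy.round(3))
```

Because each new policy is a softmax of preferences that change by a bounded amount per sweep, successive policies stay close to each other, which is the kind of smooth policy update the abstract refers to.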
