Frontiers in Neurorobotics

A novel approach to locomotion learning: Actor-Critic architecture using central pattern generators and dynamic motor primitives

Abstract

In this article, we propose an architecture for a bio-inspired controller that addresses the problem of learning different locomotion gaits for different robot morphologies. The modeling objective is split into two parts: baseline motion modeling and dynamics adaptation. Baseline motion modeling aims to achieve the fundamental functions of a certain type of locomotion, while dynamics adaptation provides a “reshaping” function that adapts the baseline motion to the desired motion. Based on this assumption, a three-layer architecture is developed using central pattern generators (CPGs, a bio-inspired locomotor center for the baseline motion) and dynamic motor primitives (DMPs, a model with universal “reshaping” functions). We use this architecture with actor-critic algorithms to find a good “reshaping” function. To demonstrate the learning power of the actor-critic based architecture, we tested it in two experiments: (1) learning to crawl on a humanoid and (2) learning to gallop on a puppy robot. Two types of actor-critic algorithms (policy search and policy gradient) are compared in order to evaluate the advantages and disadvantages of different actor-critic based learning algorithms for different morphologies. Finally, based on the analysis of the experimental results, a generic view/architecture for locomotion learning is discussed in the conclusion.
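
The split between a baseline motion and a "reshaping" function can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single joint, a plain phase oscillator as a stand-in for the CPG, and a normalized sum of periodic basis functions as a stand-in for the DMP reshaping term. All function names, parameters, and default values are hypothetical. The weights are the kind of parameters an actor-critic learner could tune; the learning loop itself is omitted.

```python
import math


def cpg_baseline(phase, amplitude=1.0, offset=0.0):
    """Baseline joint trajectory from a phase oscillator (assumed stand-in for a CPG)."""
    return offset + amplitude * math.sin(phase)


def reshaping_term(phase, weights, width=2.0):
    """Periodic reshaping term: normalized weighted sum of von Mises-style basis functions.

    The basis centers are spread evenly over one cycle (an assumption of this sketch).
    """
    n = len(weights)
    centers = [2.0 * math.pi * i / n for i in range(n)]
    activations = [math.exp(width * (math.cos(phase - c) - 1.0)) for c in centers]
    total = sum(activations)  # always positive because exp() > 0
    return sum(w * a for w, a in zip(weights, activations)) / total


def joint_command(phase, weights):
    """Desired joint position: baseline motion plus the learned reshaping."""
    return cpg_baseline(phase) + reshaping_term(phase, weights)


def rollout(weights, frequency_hz=1.0, dt=0.01, steps=100):
    """Advance the phase at a fixed frequency and collect joint commands."""
    phase = 0.0
    commands = []
    for _ in range(steps):
        commands.append(joint_command(phase, weights))
        phase = (phase + 2.0 * math.pi * frequency_hz * dt) % (2.0 * math.pi)
    return commands
```

With all weights set to zero the command reduces to the baseline oscillation, so in this sketch a learner would only need to search over the deviation from the baseline rather than over the whole trajectory.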