Fast biped walking with a reflexive neuronal controller and policy gradient reinforcement learning

Abstract

In this paper, we present the design of, and experiments with, a planar biped robot ("RunBot") under purely reflexive neuronal control. The goal of this study is to combine neuronal mechanisms with biomechanics to obtain very high walking speed together with online learning of the circuit parameters. Our controller is built from biologically inspired sensor-neuron and motor-neuron models, uses only local reflexes, and does not employ any kind of position or trajectory-tracking control algorithm. Instead, this reflexive controller allows RunBot to exploit its own natural dynamics during critical stages of its walking gait cycle. To our knowledge, this is the first time that dynamic biped walking has been achieved using only a purely reflexive controller. In addition, this structure allows a policy gradient reinforcement learning algorithm to tune the parameters of the reflexive controller in real time during walking. In this way, RunBot can reach a relative speed of 3.5 leg-lengths per second after a few minutes of online learning, which is faster than that of any other biped robot and is comparable to the fastest relative speed of human walking. Moreover, the domain of controller parameters that yields stable walking is quite large, which supports this design strategy.
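The abstract does not specify which policy gradient algorithm is used, so the following is only a minimal sketch of the general idea, assuming a finite-difference gradient estimate: perturb each controller parameter slightly, observe how a speed score changes, and step the parameters uphill. The function `measure_speed`, its three parameters, and the target values inside it are hypothetical stand-ins for a real walking trial, and are not taken from the paper.

```python
def measure_speed(params):
    """Hypothetical stand-in for one walking trial.

    On a real robot this would run the controller with `params` and return
    the measured walking speed. Here it is a toy quadratic whose peak value
    of 3.5 is reached at made-up target parameter values.
    """
    targets = (0.6, -0.3, 1.2)  # assumed values, for illustration only
    return 3.5 - sum((p - t) ** 2 for p, t in zip(params, targets))


def estimate_gradient(params, evaluate, epsilon=0.05):
    """Estimate d(score)/d(param) with central finite differences."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += epsilon
        down[i] -= epsilon
        grad.append((evaluate(up) - evaluate(down)) / (2 * epsilon))
    return grad


def tune(params, evaluate, learning_rate=0.1, iterations=100):
    """Repeatedly step the parameters along the estimated gradient."""
    params = list(params)
    for _ in range(iterations):
        grad = estimate_gradient(params, evaluate)
        params = [p + learning_rate * g for p, g in zip(params, grad)]
    return params


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0]
    tuned = tune(start, measure_speed)
    print("speed before tuning:", measure_speed(start))
    print("speed after tuning: ", measure_speed(tuned))
```

On hardware, each call to `evaluate` would correspond to a short stretch of walking, so the number of parameters and the number of iterations determine how much walking time the tuning needs.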

Bibliographic record

Source: International Symposium on Adaptive Motion in Animals and Machines, 2005, 6 pages
Authors: Tao Geng; Bernd Porr; Florentin Woergoetter
Original format: PDF
Chinese Library Classification: TP24-53
Keywords: Biped; Reflexive controller; Policy searching

Related literature

1. Fast biped walking with a sensor-driven neuronal controller and real-time online learning [J]. Geng T, Porr B, Worgotter F. The International Journal of Robotics Research, 2006, no. 3.
2. Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller [J]. Yutaka Nakamura, Takeshi Mori, Yoichi Tokita, et al. Journal of Robotics and Mechatronics, 2005, no. 6.
3. Learning a Dynamic Policy by Using Policy Gradient: Application to Biped Walking [J]. Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, et al. Systems and Computers in Japan, 2007, no. 4.
4. Fast biped walking with a reflexive neuronal controller and policy gradient reinforcement learning [C]. Tao Geng, Bernd Porr, Florentin Woergoetter. International Symposium on Adaptive Motion in Animals and Machines, 2005.
5. Biped dynamic walking using reinforcement learning [D]. Benbrahim, Hamid. 1996.
6. Adaptive Fast Walking in a Biped Robot under Neuronal Control and Learning [O]. Poramate Manoonpong, Tao Geng, Tomas Kulvicius, et al. 2007.
7. Bipedal Walking Energy Minimization by Reinforcement Learning with Evolving Policy Parameterization [O]. Kormushev P, Ugurlu B, Calinon S, et al. 2011.
