Journal: Mechatronics: The Science of Intelligent Machines

Learning rate free reinforcement learning for real-time motion control using a value-gradient based policy

Abstract

Reinforcement learning (RL) is a framework that enables a controller to find an optimal control policy for a task in an unknown environment. Although RL has been successfully used to solve optimal control problems, learning is generally slow. The main causes are the inefficient use of information collected during interaction with the system and the inability to use prior knowledge about the system or the control task. In addition, the learning speed depends heavily on the learning rate parameter, which is difficult to tune. In this paper, we present a sample-efficient, learning-rate-free version of the Value-Gradient Based Policy (VGBP) algorithm. The main difference between VGBP and other frequently used algorithms, such as Sarsa, is that in VGBP the learning agent has direct access to the reward function, rather than just the immediate reward values. Furthermore, the agent learns a process model. This enables the algorithm to select control actions by optimizing over the right-hand side of the Bellman equation. We demonstrate fast learning convergence in simulations and experiments with the underactuated pendulum swing-up task. In addition, we present experimental results for a more complex 2-DOF robotic manipulator. © 2014 Elsevier Ltd. All rights reserved.
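
The action-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a known reward function, a process model, and a value function estimate, and it picks the action that maximizes the right-hand side of the Bellman equation, r(x, u) + gamma * V(f(x, u)). The names `select_action`, `reward_fn`, `model_fn`, `value_fn`, the discount factor, and the toy 1-D system are all hypothetical. The search over a small set of candidate actions is a simplification chosen for brevity.

```python
def select_action(x, candidates, reward_fn, model_fn, value_fn, gamma=0.95):
    """Pick the candidate action u maximizing r(x, u) + gamma * V(f(x, u)).

    All arguments are illustrative assumptions, not the paper's API:
      reward_fn(x, u) -> known reward function
      model_fn(x, u)  -> process model predicting the next state
      value_fn(x)     -> current value function estimate
    """
    def bellman_rhs(u):
        x_next = model_fn(x, u)  # predicted next state
        return reward_fn(x, u) + gamma * value_fn(x_next)

    return max(candidates, key=bellman_rhs)


# Hypothetical 1-D system: the state drifts by 0.1 * u per step.
def model_fn(x, u):
    return x + 0.1 * u


# Hypothetical reward: penalize distance from the origin and control effort.
def reward_fn(x, u):
    return -(x ** 2) - 0.01 * u ** 2


# Hypothetical value estimate: quadratic cost-to-go around the origin.
def value_fn(x):
    return -10.0 * x ** 2


candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
best = select_action(1.0, candidates, reward_fn, model_fn, value_fn)
print(best)
```

Because the agent evaluates the reward function and the model directly, it can score actions it has never tried, which a method that only observes immediate reward samples, such as Sarsa, cannot do.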