Journal: International Journal of Information Technology and Computer Science

Augmented Random Search for Quadcopter Control: An alternative to Reinforcement Learning

Abstract

Model-based reinforcement learning strategies are believed to exhibit more significant sample complexity than model-free strategies for controlling dynamical systems such as quadcopters. The belief that model-based strategies, which rely on well-trained neural networks to make such high-level decisions, always give better performance can be dispelled by using model-free policy search methods. This paper proposes a model-free random search strategy, Augmented Random Search (ARS), as a better and faster approach to training linear policies for continuous control tasks such as controlling a quadcopter's flight. The method achieves state-of-the-art accuracy without the large volume of training data that the neural networks in previous approaches to quadcopter control required. The paper also reports the performance of the search strategy in a purpose-designed task environment, evaluated in simulation. Reward collection over 1000 episodes and the agent's in-flight behavior under ARS are compared with those of a state-of-the-art reinforcement learning algorithm, Deep Deterministic Policy Gradient (DDPG). Our simulations and results show that commonly used strategies exhibit high variability in performance with respect to sample efficiency on such tasks, whereas the policy network built by ARS-Quad reacts relatively accurately to a step response, providing a better-performing alternative to reinforcement learning strategies.
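To make the idea of random search over a linear policy concrete, the following is a minimal sketch of a basic ARS-style update. It is not the paper's ARS-Quad implementation: the environment is a hypothetical toy double integrator standing in for a quadcopter simulator, and the names `rollout` and `ars` and all hyperparameter values are assumptions chosen for illustration. The state normalization used in some ARS variants is omitted.

```python
import numpy as np


def rollout(theta, horizon=50, dt=0.1):
    """Return the total reward of the linear policy u = theta @ x.

    Hypothetical toy task: drive a point mass from position error 1.0
    to the origin. This stands in for a quadcopter simulator.
    """
    x = np.array([1.0, 0.0])  # [position error, velocity]
    total = 0.0
    for _ in range(horizon):
        u = float(np.clip(theta @ x, -5.0, 5.0))  # bounded control input
        x = np.array([x[0] + dt * x[1], x[1] + dt * u])  # explicit Euler step
        total -= x[0] ** 2 + 0.1 * x[1] ** 2 + 0.01 * u ** 2
    return total


def ars(n_iters=100, n_dirs=8, top_b=4, step=0.02, noise=0.05, seed=0):
    """Basic ARS-style search over the weights of a linear policy."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(n_iters):
        # Sample random perturbation directions in parameter space.
        deltas = rng.standard_normal((n_dirs, theta.size))
        r_plus = np.array([rollout(theta + noise * d) for d in deltas])
        r_minus = np.array([rollout(theta - noise * d) for d in deltas])
        # Keep only the top_b directions, ranked by their better rollout.
        best = np.argsort(-np.maximum(r_plus, r_minus))[:top_b]
        # Scale the step by the standard deviation of the rewards used.
        sigma_r = np.concatenate([r_plus[best], r_minus[best]]).std() + 1e-8
        update = ((r_plus[best] - r_minus[best])[:, None] * deltas[best]).sum(axis=0)
        theta = theta + step / (top_b * sigma_r) * update
    return theta


if __name__ == "__main__":
    theta = ars()
    print("learned weights:", theta)
    print("reward before:", rollout(np.zeros(2)))
    print("reward after: ", rollout(theta))
```

The search needs only rollout returns, with no gradients or value networks, which is the property the abstract contrasts with DDPG. A real quadcopter task would use a larger state and action space, so `theta` would become a matrix rather than a vector.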
