Training a robust reinforcement learning controller for the uncertain system based on policy gradient method

Li Zhan; Xue Shengri; Lin Weiyang; Tong Mingsi

Abstract

This paper aims to design a model-free robust controller for uncertain systems. The uncertainties of a control system consist mainly of model uncertainty and external disturbance, both of which are widespread in practical applications. These uncertainties degrade system performance, which motivates us to train a model-free controller to address the problem. Reinforcement learning is an important branch of machine learning and can achieve good control performance by optimizing a policy without knowledge of a mathematical plant model. In this paper, we construct a reward function module to describe the specific environment of the system concerned, taking uncertainties into account. We then use a new policy gradient method to optimize the policy and implement the algorithm with neural networks in an actor-critic structure. These two networks constitute our reinforcement learning controller. Finally, we illustrate the applicability and efficiency of the proposed method by applying it to an experimental helicopter platform model that includes model uncertainties and external disturbances. (C) 2018 Elsevier B.V. All rights reserved.
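
As a rough illustration of the actor-critic policy gradient idea described in the abstract, the following is a minimal sketch. It is not the authors' algorithm or their helicopter model. It assumes a hypothetical scalar plant x' = a*x + b*u + d, where the gain a is redrawn each episode (standing in for model uncertainty) and d is a bounded random disturbance. It also assumes a quadratic reward, and it uses a linear Gaussian actor and a one-parameter critic in place of the paper's neural networks. The names reward, step, and train and all numerical values are illustrative assumptions.

```python
import numpy as np


def reward(x, u, q=1.0, r=0.1):
    """Assumed quadratic reward: penalize state deviation and control effort."""
    return -(q * x**2 + r * u**2)


def step(x, u, a, rng, b=1.0, d_max=0.05):
    """Hypothetical plant x' = a*x + b*u + d with a bounded disturbance d."""
    d = rng.uniform(-d_max, d_max)
    return a * x + b * u + d


def train(episodes=300, horizon=30, gamma=0.95,
          lr_actor=1e-3, lr_critic=1e-2, sigma=0.2, seed=0):
    """One-step actor-critic with a TD-error advantage estimate.

    Actor:  u ~ N(theta * x, sigma^2)   (linear Gaussian policy)
    Critic: V(x) = w * x^2              (single quadratic feature)
    """
    rng = np.random.default_rng(seed)
    theta, w = 0.0, 0.0
    for _ in range(episodes):
        a = rng.uniform(0.8, 1.2)   # model uncertainty: gain redrawn per episode
        x = rng.uniform(-1.0, 1.0)  # random initial state
        for _ in range(horizon):
            # Sample an action from the current stochastic policy.
            u = theta * x + sigma * rng.standard_normal()
            # Clip the state so the toy example stays numerically bounded.
            x_next = float(np.clip(step(x, u, a, rng), -2.0, 2.0))
            # TD error, used as the advantage estimate.
            td_error = reward(x, u) + gamma * w * x_next**2 - w * x**2
            # Gradient of log N(u; theta*x, sigma^2) with respect to theta.
            grad_log_pi = (u - theta * x) * x / sigma**2
            # Critic: semi-gradient TD(0) update.
            w += lr_critic * td_error * x**2
            # Actor: policy gradient step; the clip is an added safeguard.
            theta = float(np.clip(theta + lr_actor * td_error * grad_log_pi,
                                  -2.0, 2.0))
            x = x_next
    return theta, w


if __name__ == "__main__":
    theta, w = train()
    print("actor gain theta:", theta, "critic weight w:", w)
```

Calling train() returns the learned actor gain and critic weight. Redrawing a at every episode exposes the policy to a range of plant parameters during training, which is one simple way to encourage robustness in a sketch of this kind.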

Bibliographic details

Source: Neurocomputing, 2018, No. 17, pp. 313-321 (9 pages)
Authors: Li Zhan; Xue Shengri; Lin Weiyang; Tong Mingsi
Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English

Similar literature

1. Reinforcement-Learning-Based Robust Controller Design for Continuous-Time Uncertain Nonlinear Systems Subject to Input Constraints [J]. Liu Derong, Yang Xiong, Wang Ding. IEEE Transactions on Cybernetics, 2015, No. 7.
2. Data-Driven Robust Control of Discrete-Time Uncertain Linear Systems via Off-Policy Reinforcement Learning [J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, No. 12.
3. Robust control scheme for a class of uncertain nonlinear systems with completely unknown dynamics using data-driven reinforcement learning method [J]. Jiang He, Zhang Huaguang, Cui Yang. Neurocomputing, 2018, Jan. 17 issue.
4. Off-policy Reinforcement Learning for Robust Control of Discrete-time Uncertain Linear Systems [C]. Yongliang Yang, Zhishan Guo, Donald Wunsch. Chinese Control Conference, 2017.
5. Training Physics-Based Controllers for Articulated Characters with Deep Reinforcement Learning [D]. Biswas, Avishek. 2021.
6. Correction: Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail [O]. Eleni Vasilaki, Nicolas Frémaux, Robert Urbanczik. 2009.
7. Using policy gradient reinforcement learning on autonomous robot controllers [O]. Grudic, Gregory Z.; Kumar, R. Vijay; Ungar, Lyle H. 2003.
