Journal: IEEE Transactions on Neural Networks and Learning Systems

Output Feedback Q-Learning Control for the Discrete-Time Linear Quadratic Regulator Problem


Abstract

Approximate dynamic programming (ADP) and reinforcement learning (RL) have emerged as important tools in the design of optimal and adaptive control systems. Most of the existing RL and ADP methods make use of full-state feedback, a requirement that is often difficult to satisfy in practical applications. As a result, output feedback methods are more desirable as they relax this requirement. In this paper, we present a new output feedback-based Q-learning approach to solving the linear quadratic regulation (LQR) control problem for discrete-time systems. The proposed scheme is completely online in nature and works without requiring the system dynamics information. More specifically, a new representation of the LQR Q-function is developed in terms of the input-output data. Based on this new Q-function representation, output feedback LQR controllers are designed. We present two output feedback iterative Q-learning algorithms based on the policy iteration and the value iteration methods. This scheme has the advantage that it does not incur any excitation noise bias, and therefore, the need to use discounted cost functions is circumvented, which in turn ensures closed-loop stability. It is shown that the proposed algorithms converge to the solution of the LQR Riccati equation. A comprehensive simulation study is carried out, which illustrates the proposed scheme.
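For orientation, the link between the LQR Q-function and the Riccati equation mentioned in the abstract can be sketched in a few lines. This is not the paper's output feedback algorithm: it is the standard state-feedback, model-based value iteration, which assumes the matrices A and B are known, whereas the paper works from input-output data without the dynamics. The function name `lqr_q_value_iteration` and the double-integrator example are illustrative assumptions.

```python
import numpy as np

def lqr_q_value_iteration(A, B, Q, R, iters=2000):
    """Model-based value iteration on the LQR Q-function kernel H.

    Assumption: full state feedback and known (A, B); this is a textbook
    analogue, not the output feedback, model-free scheme of the paper.
    """
    n = A.shape[0]
    P = np.zeros((n, n))  # value-function matrix, V(x) = x^T P x
    for _ in range(iters):
        # Q-function: Q(x, u) = [x; u]^T H [x; u]
        H = np.block([[Q + A.T @ P @ A, A.T @ P @ B],
                      [B.T @ P @ A, R + B.T @ P @ B]])
        Hxx, Hxu = H[:n, :n], H[:n, n:]
        Hux, Huu = H[n:, :n], H[n:, n:]
        # Greedy policy u = -K x minimizes Q(x, u) over u
        K = np.linalg.solve(Huu, Hux)
        # Minimizing over u gives the Riccati recursion for P
        P = Hxx - Hxu @ K
    return H, P, K

if __name__ == "__main__":
    # Hypothetical example system: discretized double integrator
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    H, P, K = lqr_q_value_iteration(A, B, np.eye(2), np.array([[1.0]]))
    print("feedback gain K =", K)
```

At convergence, P satisfies the discrete-time algebraic Riccati equation and the gain K stabilizes the closed loop A - BK, which is the fixed point that the abstract says the proposed algorithms converge to.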
