Annual American Control Conference

Output Feedback Reinforcement Learning Control for the Continuous-Time Linear Quadratic Regulator Problem


Abstract

In this paper, we present an output feedback reinforcement learning scheme to solve the LQR problem for continuous-time linear systems. The problem consists of finding the optimal feedback gain that achieves asymptotic stability without knowledge of the system dynamics or measurements of the full state. An output feedback policy iteration algorithm is proposed that iteratively solves the ADP Bellman equation to find the optimal control parameters. Unlike existing methods, the proposed scheme requires no discrete approximation and is not affected by excitation noise bias. As a result, the need for a discounting factor, which has been a bottleneck in achieving stability guarantees, is eliminated. The learned control parameters are optimal and match exactly the solution of the LQR Riccati equation. Simulation results show the effectiveness of the proposed scheme.
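To illustrate the fixed point the abstract refers to, the following is a minimal, model-based sketch of policy iteration for the continuous-time LQR (Kleinman's algorithm), not the paper's data-driven output feedback scheme: it alternates policy evaluation (a Lyapunov equation) with policy improvement, and its iterates converge to the same gain given by the LQR Riccati equation. The plant matrices and initial stabilizing gain below are illustrative assumptions; the paper's contribution is reaching this fixed point without knowing `A`, `B` or the full state.

```python
# Model-based policy iteration for continuous-time LQR (Kleinman's algorithm).
# Illustrative sketch only: the example plant (A, B, Q, R) and the initial
# stabilizing gain K are assumptions, not taken from the paper.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, -1.0]])   # example plant dynamics
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                             # state cost
R = np.array([[1.0]])                     # input cost

K = np.array([[1.0, 1.0]])                # initial stabilizing feedback gain
for _ in range(10):
    Ac = A - B @ K                        # closed-loop matrix under current policy
    # Policy evaluation: solve Ac^T P + P Ac + Q + K^T R K = 0
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    # Policy improvement: K <- R^{-1} B^T P
    K = np.linalg.solve(R, B.T @ P)

# The iterates match the direct solution of the LQR Riccati equation.
P_are = solve_continuous_are(A, B, Q, R)
print(np.allclose(P, P_are))
```

Each iteration requires only a linear (Lyapunov) solve rather than the quadratic Riccati equation, which is what makes the policy iteration structure amenable to being learned from data.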
