Journal: International Journal of Control

Adaptive dynamic programming for discrete-time linear quadratic regulation based on multirate generalised policy iteration

Abstract

In this paper, we propose two multirate generalised policy iteration (GPI) algorithms applied to discrete-time linear quadratic regulation problems. The proposed algorithms extend the existing GPI algorithm, which consists of approximate policy evaluation and policy improvement steps. The two proposed schemes, heuristic dynamic programming (HDP) and dual HDP (DHP), both based on multirate GPI, use multi-step estimation (the M-step Bellman equation) in the approximate policy evaluation step to estimate the value function and its gradient, called the costate, respectively. We then show that these two methods, when given the same update horizon, can be considered equivalent in the iteration domain. Furthermore, monotonically increasing and decreasing convergences, the so-called value iteration (VI)-mode and policy iteration (PI)-mode convergences, are proved to hold for the proposed multirate GPIs. General convergence properties in terms of eigenvalues are also studied. Data-driven online implementation methods for the proposed HDP and DHP are demonstrated. Finally, we present the results of numerical simulations performed to verify the effectiveness of the proposed methods.
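As a rough illustration of the multi-step (M-step) policy evaluation idea described above, the following model-based sketch runs generalised policy iteration on a scalar discrete-time LQR problem. It is not the paper's data-driven HDP or DHP implementation: the function names, the scalar system, and the numerical values are all assumptions chosen for illustration. In this sketch, M = 1 reduces to value iteration, and a large M approaches policy iteration when the closed loop is stable.

```python
def multirate_gpi_scalar(a, b, q, r, M, p0=0.0, iters=200):
    """Model-based M-step GPI for the scalar system x+ = a*x + b*u
    with stage cost q*x^2 + r*u^2 (hypothetical helper, illustration only).

    Returns the final value coefficient p (V(x) = p*x^2), the final
    feedback gain k (u = -k*x), and the history of p over iterations.
    """
    p = p0
    k = 0.0
    history = [p]
    for _ in range(iters):
        # Policy improvement: greedy gain for the current value estimate p.
        k = (b * p * a) / (r + b * p * b)
        ac = a - b * k           # closed-loop dynamics under u = -k*x
        stage = q + r * k * k    # one-step cost coefficient under this gain
        # Approximate policy evaluation: apply the Bellman operator of the
        # fixed gain k exactly M times, starting from the current estimate.
        v = p
        for _ in range(M):
            v = stage + ac * v * ac
        p = v
        history.append(p)
    return p, k, history


def riccati_residual(a, b, q, r, p):
    """Residual of the scalar discrete-time algebraic Riccati equation."""
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p) - p


if __name__ == "__main__":
    # Assumed example system (open-loop unstable, a > 1).
    for M in (1, 3, 10):
        p, k, _ = multirate_gpi_scalar(1.2, 1.0, 1.0, 1.0, M)
        print(M, p, k, riccati_residual(1.2, 1.0, 1.0, 1.0, p))
```

The update horizon M is the only parameter that changes between the runs in the usage example; in all three cases the iteration should settle on the same Riccati solution, with the residual helper providing a simple check.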