Error bound analysis of policy iteration based approximate dynamic programming for deterministic discrete-time nonlinear systems

International Joint Conference on Neural Networks

Abstract

Numerous approximate dynamic programming (ADP) algorithms have been developed based on policy iteration. For policy iteration based ADP of deterministic discrete-time nonlinear systems, the existing literature has proved convergence in the undiscounted value function formulation under the assumption of exact approximation. Furthermore, the error bound of policy iteration based ADP has been analyzed in a discounted value function formulation that takes approximation errors into account. However, there has been no error bound analysis of policy iteration based ADP in the undiscounted value function formulation with approximation errors considered. In this paper, we fill this theoretical gap. We provide a sufficient condition on the approximation error under which the iterative value function is bounded in a neighbourhood of the optimal value function. To the best of the authors' knowledge, this is the first error bound result for undiscounted policy iteration on deterministic discrete-time nonlinear systems that accounts for approximation errors.
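For readers unfamiliar with the setting, the undiscounted policy iteration scheme the abstract refers to can be sketched as follows. For a deterministic discrete-time system x_{k+1} = F(x_k, u_k) with nonnegative stage cost U(x_k, u_k), each iteration alternates policy evaluation and policy improvement; the symbols F, U, V_i, and \mu_i below are illustrative and need not match the paper's own notation:

    V_i(x_k) = U(x_k, \mu_i(x_k)) + V_i(F(x_k, \mu_i(x_k)))      \quad \text{(policy evaluation)}
    \mu_{i+1}(x_k) = \arg\min_{u} \big[ U(x_k, u) + V_i(F(x_k, u)) \big]   \quad \text{(policy improvement)}

In the approximate setting the paper studies, each step is carried out only up to an approximation error, e.g. a computed \hat{V}_i satisfying |\hat{V}_i(x) - V_i(x)| \le \epsilon in place of the exact V_i. The contribution summarized above is a sufficient condition on such errors guaranteeing that the iterates remain within a neighbourhood of the optimal value function V^*.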
