Error bound analysis of policy iteration based approximate dynamic programming for deterministic discrete-time nonlinear systems

International Joint Conference on Neural Networks

Abstract

Numerous approximate dynamic programming (ADP) algorithms have been developed based on policy iteration. For policy iteration based ADP of deterministic discrete-time nonlinear systems, the existing literature has proved convergence in the undiscounted value function formulation under the assumption of exact approximation. Furthermore, the error bound of policy iteration based ADP has been analyzed in a discounted value function formulation that takes approximation errors into account. However, there has been no error bound analysis of policy iteration based ADP in the undiscounted value function formulation with approximation errors considered. In this paper, we fill this theoretical gap. We provide a sufficient condition on the approximation error under which the iterative value function is bounded in a neighbourhood of the optimal value function. To the best of the authors' knowledge, this is the first error bound result for undiscounted policy iteration on deterministic discrete-time nonlinear systems that accounts for approximation errors.
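For readers unfamiliar with the setting, the undiscounted policy iteration scheme the abstract refers to can be sketched as follows. For a deterministic discrete-time system x_{k+1} = F(x_k, u_k) with nonnegative stage cost U(x_k, u_k), each iteration alternates policy evaluation and policy improvement; the symbols F, U, V_i, and \mu_i below are illustrative and need not match the paper's own notation:

    V_i(x_k) = U(x_k, \mu_i(x_k)) + V_i(F(x_k, \mu_i(x_k)))      \quad \text{(policy evaluation)}
    \mu_{i+1}(x_k) = \arg\min_{u} \big[ U(x_k, u) + V_i(F(x_k, u)) \big]   \quad \text{(policy improvement)}

In the approximate setting the paper studies, each step is carried out only up to an approximation error, e.g. a computed \hat{V}_i satisfying |\hat{V}_i(x) - V_i(x)| \le \epsilon in place of the exact V_i. The contribution summarized above is a sufficient condition on such errors guaranteeing that the iterates remain within a neighbourhood of the optimal value function V^*.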
