Conference: IFAC World Congress

Model-free Adaptive Dynamic Programming for Optimal Control of Discrete-time Affine Nonlinear System

Abstract

This paper proposes a model-free approach, based on adaptive dynamic programming, for solving the infinite-horizon optimal control problem for discrete-time affine nonlinear systems. The approach uses an actor-critic structure in which two multilayer perceptron neural networks approximate the state-action value function and the control policy, respectively. Policy iteration is carried out on data collected arbitrarily from any reasonable sampling distribution. In the policy evaluation phase, a novel objective function is defined for updating the critic network, so that the critic converges to the solution of the Bellman equation directly rather than iteratively. In the policy improvement phase, the action network is updated to minimize the output of the critic network. The two phases alternate until no further improvement of the control policy is observed, at which point the optimal control policy is taken to have been reached. Two simulation examples demonstrate the effectiveness of the approach.
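The alternation between policy evaluation and policy improvement described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it assumes a scalar linear plant (a special case of an affine system), replaces the two multilayer perceptrons with quadratic features and a linear gain, and uses an ordinary least-squares fit of the Bellman residual in place of the paper's objective function, which the abstract does not specify. The constants `A`, `B`, `Q_COST`, `R_COST` and all function names are hypothetical. The plant is used only to generate sampled transitions; the learning steps see nothing but the data.

```python
import random

# Hypothetical plant x' = A*x + B*u; used only to generate data, never by the learner.
A, B = 0.9, 0.5
# Hypothetical stage cost Q_COST*x^2 + R_COST*u^2.
Q_COST, R_COST = 1.0, 1.0


def features(x, u):
    """Quadratic features, so that Q(x, u) = theta . features(x, u)."""
    return [x * x, x * u, u * u]


def collect_data(n, seed=0):
    """Sample (x, u, cost, x_next) tuples with x and u drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((x, u, Q_COST * x * x + R_COST * u * u, A * x + B * u))
    return data


def solve(mat, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = s / aug[i][i]
    return sol


def evaluate_policy(data, k):
    """Critic step: least-squares fit of Q(x,u) - Q(x', -k*x') = cost over the batch."""
    mat = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for x, u, cost, x_next in data:
        phi = features(x, u)
        phi_next = features(x_next, -k * x_next)
        d = [p - pn for p, pn in zip(phi, phi_next)]
        for i in range(3):
            rhs[i] += d[i] * cost
            for j in range(3):
                mat[i][j] += d[i] * d[j]
    return solve(mat, rhs)


def improve_policy(theta):
    """Actor step: gain k of the control u = -k*x that minimizes theta . features(x, u)."""
    return theta[1] / (2.0 * theta[2])


def policy_iteration(data, k0=0.0, tol=1e-10, max_iter=50):
    """Alternate the two steps until the gain stops changing (or max_iter is hit)."""
    k = k0
    for _ in range(max_iter):
        k_new = improve_policy(evaluate_policy(data, k))
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    return k
```

Calling `policy_iteration(collect_data(200))` returns a feedback gain for the control law `u = -k*x`. The initial gain `k0 = 0` is admissible here only because the assumed plant is open-loop stable; a different plant would need a stabilizing initial policy. For this linear-quadratic case the critic step has a closed-form solution, whereas a neural-network critic would generally have to minimize its objective numerically.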
