Conference: AIAA Guidance, Navigation, and Control Conference
Adaptive Optimal Control of Partially-unknown Constrained-input Systems using Policy Iteration with Experience Replay

Abstract

This paper develops an online learning algorithm that finds optimal control solutions for partially unknown continuous-time systems subject to input constraints. The input constraints are encoded into the optimal control problem through a nonquadratic performance functional. An online policy iteration algorithm that uses integral reinforcement knowledge learns the solution to the optimal control problem without requiring the full dynamics model. The policy iteration algorithm is implemented on an actor-critic structure in which two neural network approximators are tuned online and simultaneously to generate the optimal control law. A novel technique based on experience replay retains past data when updating the neural network weights: recorded data are used concurrently with current data to adapt the critic neural network weights. This concurrent learning provides an easy-to-check real-time condition for persistence of excitation that is sufficient to guarantee convergence to a near-optimal control law. Stability of the proposed feedback control law is shown, and its performance is evaluated through simulations.
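The experience-replay idea in the abstract can be illustrated with a small sketch. The abstract does not give the update law, so everything below is an assumption made for illustration: the critic is taken to be linear in its weights, each sample is a pair (`dphi`, `rho`) holding the change in the critic basis over one integration interval and the integrated cost over that interval, the "easy-to-check" condition is taken to be a rank test on the recorded `dphi` vectors, and the update is a normalized gradient step. The function names `replay_condition_holds` and `critic_step` are hypothetical and are not taken from the paper.

```python
import numpy as np


def replay_condition_holds(recorded_dphi, n_weights):
    """Assumed check: the recorded regressors span R^n_weights (full rank)."""
    stack = np.atleast_2d(np.asarray(recorded_dphi, dtype=float))
    return np.linalg.matrix_rank(stack) == n_weights


def critic_step(w, current, recorded, lr):
    """One normalized-gradient step using the current and all recorded samples.

    Each sample is (dphi, rho); the assumed Bellman residual is dphi @ w + rho.
    """
    grad = np.zeros_like(w)
    for dphi, rho in [current, *recorded]:
        residual = dphi @ w + rho
        grad += dphi * residual / (1.0 + dphi @ dphi) ** 2
    return w - lr * grad


# Toy data generated from known weights, so every residual vanishes at w_true.
w_true = np.array([1.0, 2.0])
recorded_dphi = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
recorded = [(d, -(d @ w_true)) for d in recorded_dphi]
current_dphi = np.array([1.0, 1.0])
current = (current_dphi, -(current_dphi @ w_true))

w = np.zeros(2)
for _ in range(200):
    w = critic_step(w, current, recorded, lr=1.0)
```

In this toy setting the recorded data alone already satisfy the rank test, so the weights converge even if the current sample stops being informative. That is the intuition behind replacing a hard-to-verify persistence-of-excitation requirement with a condition on stored data.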