Journal: Journal of Industrial and Management Optimization

FINITE-HORIZON OPTIMAL CONTROL OF DISCRETE-TIME LINEAR SYSTEMS WITH COMPLETELY UNKNOWN DYNAMICS USING Q-LEARNING



Abstract

This paper investigates the finite-horizon optimal control problem for discrete-time linear systems with completely unknown dynamics. Here, "completely unknown" means that the system dynamics are unknown. Compared with infinite-horizon optimal control, the Riccati equation (RE) of finite-horizon optimal control is time-dependent and must satisfy terminal boundary constraints, which makes the problem more challenging. The completely unknown system dynamics pose an additional challenge. The main contribution of this paper is a cyclic fixed-finite-horizon-based Q-learning algorithm that approximates the optimal control input without requiring knowledge of the system dynamics. The algorithm mainly consists of two phases: a data collection phase over a fixed finite horizon and a parameter update phase. A least-squares method links the two phases, which are repeated cyclically to obtain the optimal parameters. Finally, simulation results are given to verify the effectiveness of the proposed cyclic fixed-finite-horizon-based Q-learning algorithm.
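
To make the idea of a least-squares Q-function fit for a finite-horizon linear-quadratic problem concrete, the following is a minimal sketch in Python. It is not the paper's cyclic algorithm, whose details are not given in the abstract; it is a generic backward recursion that fits a quadratic Q-function at each time step from sampled transitions. The system matrices `A` and `B`, the cost weights `Q`, `R`, `P_N`, the horizon `N`, and all function names are assumptions chosen for illustration. The learner uses only the samples `(X, U, X_next)` and the cost weights; `A` and `B` appear only to generate data and to compute a model-based Riccati reference for comparison.

```python
import numpy as np

def quad_features(Z):
    # Monomials z_i * z_j (i <= j) for each row of Z.
    i, j = np.triu_indices(Z.shape[1])
    return Z[:, i] * Z[:, j]

def theta_to_H(theta, d):
    # Rebuild the symmetric H with z' H z == quad_features(z) @ theta.
    i, j = np.triu_indices(d)
    H = np.zeros((d, d))
    H[i, j] = theta
    return (H + H.T) / 2.0

def quad_form(V, M):
    # Row-wise v' M v.
    return np.einsum('ij,jk,ik->i', V, M, V)

def finite_horizon_q_fit(X, U, X_next, Q, R, P_N, N):
    """Backward least-squares fit of time-dependent Q-functions.

    Returns feedback gains K_0..K_{N-1} (u_k = -K_k x_k) and P_0.
    Uses only sampled transitions, never the system matrices.
    """
    n = X.shape[1]
    Phi = quad_features(np.hstack([X, U]))
    stage_cost = quad_form(X, Q) + quad_form(U, R)
    P = P_N                      # terminal boundary condition
    gains = [None] * N
    for k in range(N - 1, -1, -1):
        # Target: stage cost plus optimal cost-to-go at step k + 1.
        y = stage_cost + quad_form(X_next, P)
        theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        H = theta_to_H(theta, Phi.shape[1] and X.shape[1] + U.shape[1])
        Hxx, Hxu, Huu = H[:n, :n], H[:n, n:], H[n:, n:]
        K = np.linalg.solve(Huu, Hxu.T)
        gains[k] = K
        P = Hxx - Hxu @ K        # cost-to-go matrix at step k
    return gains, P

def riccati_P0(A, B, Q, R, P_N, N):
    # Model-based reference: time-dependent Riccati recursion.
    P = P_N
    for _ in range(N):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ G
    return P

# Hypothetical example system (a discretized double integrator).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R, P_N, N = np.eye(2), np.eye(1), np.eye(2), 10

# Data collection: random state-input pairs and the observed next states.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
U = rng.standard_normal((200, 1))
X_next = X @ A.T + U @ B.T

gains, P0 = finite_horizon_q_fit(X, U, X_next, Q, R, P_N, N)
```

Because the example system is time-invariant and noise-free, one batch of samples is reused at every backward step, and the fitted `P0` can be checked against `riccati_P0(A, B, Q, R, P_N, N)`. With noisy or trajectory-based data, the sample requirements and the excitation of the inputs would need more care than this sketch shows.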
