Journal: Optimal Control Applications and Methods

Optimal tracking control for non-zero-sum games of linear discrete-time systems via off-policy reinforcement learning

Abstract

This article applies a model-free off-policy reinforcement learning algorithm to the optimal tracking problem for multiplayer non-zero-sum games of discrete-time linear systems. Unlike the traditional method and the policy iteration method for solving optimal tracking problems, the proposed algorithm uses system data rather than knowledge of the system dynamics. To implement the algorithm, an auxiliary augmented system is constructed by combining the original system with the reference trajectory, and a discount factor is introduced into the performance indexes. The analysis shows that the solutions of the proposed algorithm converge to the Nash equilibrium and that the result is not affected by the probing noise. Two simulations verify the feasibility and effectiveness of the proposed algorithm.
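
The abstract does not give the form of the augmented system or of the performance indexes, so the following is only a minimal sketch under assumptions commonly made in linear-quadratic tracking: the plant is x_{k+1} = A x_k + sum_i B_i u_{i,k}, the reference is generated by r_{k+1} = F r_k, and each player's index is a discounted sum of quadratic tracking-error and input penalties. Every name and number below (`block_diag`, `pad_input`, `step`, `discounted_cost`, the matrices, the scalar weights) is hypothetical and not taken from the article.

```python
# Illustrative sketch only: the matrices, weights, and function names are
# assumptions, not the article's notation or its algorithm.

def block_diag(a, f):
    """Return the block-diagonal matrix [[a, 0], [0, f]] as nested lists."""
    n, m = len(a[0]), len(f[0])
    top = [row + [0.0] * m for row in a]
    bottom = [[0.0] * n + row for row in f]
    return top + bottom

def pad_input(b, ref_dim):
    """Stack zero rows under b, since the inputs do not drive the reference."""
    return b + [[0.0] * len(b[0]) for _ in range(ref_dim)]

def matvec(m, v):
    """Multiply a nested-list matrix by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def step(t, b_list, z, u_list):
    """One step z_{k+1} = T z_k + sum_i B_i u_i of the augmented system."""
    nxt = matvec(t, z)
    for b, u in zip(b_list, u_list):
        nxt = [s + d for s, d in zip(nxt, matvec(b, u))]
    return nxt

def discounted_cost(errors, inputs, q, r, gamma):
    """sum_k gamma^k (q e_k^2 + r u_k^2) for scalar error and scalar input."""
    return sum(gamma ** k * (q * e * e + r * u * u)
               for k, (e, u) in enumerate(zip(errors, inputs)))

# Assumed two-player example: 2-state plant, constant scalar reference.
A = [[1.0, 0.1], [0.0, 1.0]]
F = [[1.0]]
B1 = [[0.0], [0.1]]
B2 = [[0.1], [0.0]]

T = block_diag(A, F)                      # augmented dynamics matrix
B_aug = [pad_input(B1, 1), pad_input(B2, 1)]

z0 = [0.0, 0.0, 1.0]                      # augmented state [x; r]
z1 = step(T, B_aug, z0, [[1.0], [2.0]])   # both players act at once

cost = discounted_cost([1.0, 0.5], [0.0, 1.0], q=1.0, r=1.0, gamma=0.5)
```

With a reference that does not decay to zero, such as the constant one assumed here, a discount factor below 1 is the usual way to keep the infinite-horizon sum finite. The sketch only builds the augmented model and evaluates a cost; the article's algorithm is model-free and learns from data, which this example does not attempt to reproduce.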
