Journal: IFAC PapersOnLine

Data-driven dynamic multi-objective optimal control: A Hamiltonian-inequality driven satisficing reinforcement learning approach



Abstract

This paper presents an iterative data-driven algorithm for solving dynamic multi-objective (MO) optimal control problems for nonlinear continuous-time systems. It is first shown that the Hamiltonian function corresponding to each objective can serve as a comparison function for evaluating the performance of admissible policies. Relaxed Hamilton-Jacobi-Bellman (HJB) equations, expressed as HJB inequalities, are then solved in a dynamic constrained MO framework to find Pareto-optimal solutions. The relation to the satisficing (good-enough) decision-making framework is shown. A sum-of-squares (SOS)-based iterative algorithm is developed to solve the formulated MO optimization problem with HJB inequalities. To obviate the requirement of complete knowledge of the system dynamics, a data-driven satisficing reinforcement learning approach is proposed that solves the SOS optimization problem in real time using only system trajectories measured over a time interval. Finally, a simulation example demonstrates the effectiveness of the proposed algorithm.
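The idea that a Hamiltonian inequality can certify the performance of a policy can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it involves no SOS programming, no data-driven learning, and only a single objective. It assumes a hypothetical scalar linear system dx/dt = a*x + b*u with cost integral of q*x^2 + r*u^2, a linear policy u = -k*x, and a candidate value function V(x) = p*x^2; all of these names and numbers are illustrative assumptions. For this setup the Hamiltonian is H = c*x^2 with c = 2p(a - bk) + q + rk^2, and if c <= 0 with a stabilizing policy, integrating along the trajectory gives cost <= V(x0).

```python
def hamiltonian_coeff(p, k, a=1.0, b=1.0, q=1.0, r=1.0):
    """Return c such that H(V, u) = c * x**2 for V = p*x**2 and u = -k*x.

    H = dV/dx * (a*x + b*u) + q*x**2 + r*u**2 (assumed toy setup).
    """
    return 2.0 * p * (a - b * k) + q + r * k**2


def closed_loop_cost(k, x0, a=1.0, b=1.0, q=1.0, r=1.0):
    """Exact infinite-horizon cost of u = -k*x starting from x0.

    With x(t) = x0 * exp((a - b*k) * t), the cost integral has a closed form,
    valid only when the closed loop is stable (a - b*k < 0).
    """
    rate = a - b * k
    if rate >= 0.0:
        raise ValueError("policy is not stabilizing; the cost is unbounded")
    return (q + r * k**2) * x0**2 / (-2.0 * rate)


if __name__ == "__main__":
    k, x0 = 3.0, 1.0
    for p in (2.0, 2.5, 3.0):
        c = hamiltonian_coeff(p, k)
        bound = p * x0**2
        cost = closed_loop_cost(k, x0)
        # When c <= 0 the candidate V upper-bounds the true cost.
        print(f"p={p}: H coeff={c:+.2f}, V(x0)={bound:.2f}, cost={cost:.2f}")
```

In this toy case the inequality holds with equality exactly when V matches the true cost of the policy, and fails (c > 0) when V underestimates it, which is the sense in which the Hamiltonian can act as a comparison function.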
