
Decentralized Reinforcement Learning Robust Optimal Tracking Control for Time Varying Constrained Reconfigurable Modular Robot Based on ACI and Q-Function

Abstract

A novel decentralized reinforcement learning robust optimal tracking control theory for time-varying constrained reconfigurable modular robots, based on the action-critic-identifier (ACI) and the state-action value function (Q-function), is presented to solve the continuous-time nonlinear optimal control policy problem for strongly coupled, uncertain robotic systems. The dynamics of the time-varying constrained reconfigurable modular robot are described as a synthesis of interconnected subsystems, and a continuous-time state equation and Q-function are designed. By combining ACI with an RBF network, the global uncertainty of the subsystem and the HJB (Hamilton-Jacobi-Bellman) equation are estimated: the critic-NN and action-NN approximate the optimal Q-function and the optimal control policy, the identifier identifies the global uncertainty, and the RBF-NN is used to update the weights of the ACI-NN. On this basis, a novel decentralized robust optimal tracking controller for the subsystem is proposed, so that the subsystem can track the desired trajectory and the tracking error converges to zero in finite time. The stability of the ACI and of the robust optimal tracking controller is confirmed by Lyapunov theory. Finally, comparative simulation examples illustrate the effectiveness of the proposed ACI and decentralized control theory.
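The abstract says a critic network approximates the Q-function with help from an RBF network. As a rough illustration of that idea only, the sketch below fits a linear-in-weights Gaussian RBF approximator to a toy state-action value function. It is not the paper's method: the target `Q(e, u) = e^2 + u^2`, the plain gradient-descent update, and all names (`rbf_features`, `fit_critic`, `centers`, `width`) are assumptions made for this example.

```python
import numpy as np

def rbf_features(z, centers, width):
    """Gaussian RBF activations for a single input vector z of shape (d,)."""
    diff = centers - z  # shape (m, d), one row per RBF center
    return np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * width ** 2))

def fit_critic(samples, targets, centers, width, lr=0.1, epochs=200):
    """Fit weights w so that w @ phi(z) approximates the targets.

    Uses per-sample gradient descent on the squared error. This is a
    generic stand-in, not the weight-update law from the paper.
    """
    w = np.zeros(len(centers))
    for _ in range(epochs):
        for z, q in zip(samples, targets):
            phi = rbf_features(z, centers, width)
            err = w @ phi - q
            w -= lr * err * phi
    return w

# Toy data: inputs are (tracking error e, control u); the quadratic
# target is a hypothetical stand-in for an optimal Q-function.
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 5)
centers = np.array([[a, b] for a in grid for b in grid])  # 25 centers
samples = rng.uniform(-1.0, 1.0, size=(200, 2))
targets = np.sum(samples ** 2, axis=1)

w = fit_critic(samples, targets, centers, width=0.5)
pred = np.array([w @ rbf_features(z, centers, 0.5) for z in samples])
mse = float(np.mean((pred - targets) ** 2))
baseline = float(np.mean(targets ** 2))  # error of the all-zero weights
```

Comparing `mse` with `baseline` shows how much of the toy target the RBF weights capture. In the paper the approximation target comes from the HJB equation rather than from a known function, so this only conveys the function-approximation step.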

Bibliographic record

  • Source
    Mathematical Problems in Engineering | 2013, Issue 16 | 387817.1-387817.16 | 16 pages
  • Authors

    Dong Bo; Li Yuanchun

  • Affiliations

    Jilin Univ, Dept Commun Engn, Changchun 130022, Peoples R China

    Changchun Univ Technol, Dept Control Engn, Changchun 130012, Peoples R China

  • Original format: PDF
  • Language: English
