Journal: Diffusion and Defect Data. Solid State Data, Part B. Solid State Phenomena

Discrete Action Dependant Heuristic Dynamic Programming in control of a wheeled mobile robot

Abstract

In this paper we propose a discrete tracking control algorithm for a two-wheeled mobile robot. The control algorithm consists of a discrete Adaptive Critic Design (ACD) in the Action Dependant Heuristic Dynamic Programming (ADHDP) configuration, a PD controller, and a supervisory term derived from the Lyapunov stability theorem and based on variable structure systems theory. Adaptive Critic Designs are a group of algorithms that use two independent structures: one estimates the optimal value function from the Bellman equation, and the other estimates the optimal control law. The ADHDP algorithm consists of an Actor (ASE, Associative Search Element) that estimates the optimal control law and a Critic (ACE, Adaptive Critic Element) that evaluates the quality of control by estimating the optimal value function from the Bellman equation. Both structures are implemented as neural networks (NN). The ADHDP algorithm does not require a plant model (here, a model of the wheeled mobile robot, WMR) for the ACE or ASE weight update procedure, in contrast with other ACD configurations such as Heuristic Dynamic Programming or Dual Heuristic Programming, which use the plant model. In the presented control algorithm the Actor-Critic structure is supported by the PD controller and the supervisory term, which guarantee stable tracking during the initial learning phase of the adaptive critic neural networks and robustness in the face of disturbances. The proposed control algorithm was verified on the two-wheeled mobile robot Pioneer-2DX.
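
The model-free character of the ADHDP update can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar error dynamics, the quadratic critic features, the single-weight linear actor, the quadratic cost, the output clipping, and every numeric constant below are assumptions chosen only to keep the example short. A proportional feedback term stands in for the PD controller, and the supervisory term is omitted. The plant coefficients are used only to simulate the next error; neither weight update reads them.

```python
def features(e, u):
    """Quadratic features for a critic Q(e, u) = w_c . phi(e, u) (assumed form)."""
    return [e * e, e * u, u * u]


def dot(w, phi):
    return sum(wi * pi for wi, pi in zip(w, phi))


def control(w_a, e, kp, u_max):
    """Total control: clipped actor output plus a proportional feedback term."""
    u_actor = max(-u_max, min(u_max, w_a * e))  # clipping is an assumption
    return u_actor - kp * e


def run_adhdp(steps=200, gamma=0.9, alpha_c=0.01, alpha_a=0.01,
              kp=2.0, r_u=0.1, u_max=0.5):
    a, b = 0.9, 0.1        # hypothetical plant, used ONLY to simulate e_next
    w_c = [0.0, 0.0, 0.0]  # critic (ACE) weights
    w_a = 0.0              # actor (ASE) weight
    e = 1.0                # initial tracking error
    u = control(w_a, e, kp, u_max)
    errors = [e]
    for _ in range(steps):
        cost = e * e + r_u * u * u          # assumed local cost
        e_next = a * e + b * u              # "measurement" from the plant
        u_next = control(w_a, e_next, kp, u_max)

        # Critic: temporal-difference error of the action-dependent value.
        phi = features(e, u)
        delta = cost + gamma * dot(w_c, features(e_next, u_next)) - dot(w_c, phi)
        w_c = [w + alpha_c * delta * p for w, p in zip(w_c, phi)]

        # Actor: descend dQ/du obtained from the critic, not from a plant
        # model. du/dw_a is taken as e (saturation of the clip is ignored).
        dq_du = w_c[1] * e + 2.0 * w_c[2] * u
        w_a -= alpha_a * dq_du * e

        e, u = e_next, u_next
        errors.append(e)
    return errors, w_c, w_a


if __name__ == "__main__":
    errors, w_c, w_a = run_adhdp()
    print("final |e| =", abs(errors[-1]))
    print("critic weights =", w_c, "actor weight =", w_a)
```

In this sketch the fixed feedback term keeps the error bounded while the critic and actor weights are still near their initial values, which loosely mirrors the role the abstract assigns to the PD controller and supervisory term during the initial learning phase.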
