Journal: Soft Computing

An Adaptive Actor-critic Algorithm with Multi-step Simulated Experiences for Controlling Nonholonomic Mobile Robots


Abstract

In this paper, we propose a new adaptive actor-critic algorithm with multi-step simulated experiences, as a kind of temporal difference (TD) method. In our approach, the TD error is composed of two value functions and m utility functions, where m denotes the number of steps over which the experience is simulated. The value function is constructed from the critic, which is formulated as a radial basis function neural network (RBFNN) whose input is a simulated experience generated by a predictive model based on a kinematic model. Because our approach assumes that the model is available both to simulate the m-step experiences and to design a controller, the same kinematic model is also used to construct the actor. The resulting model-based actor (MBA) is likewise regarded as a network, i.e., it is viewed as a resolved velocity control network. We apply this approach to the control of a nonholonomic mobile robot, specifically to trajectory tracking control of the position coordinates and azimuth. Simulations show the effectiveness of the proposed method for controlling a mobile robot with two independent driving wheels.
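
The abstract does not give the equations, so the following is only a minimal sketch of how an m-step TD error built from two value functions and m utility functions might be computed from simulated experience. It assumes the standard unicycle kinematic model for a two-wheeled robot, a Gaussian RBF critic, a quadratic tracking utility, and the conventional n-step form delta = sum_{k=0}^{m-1} gamma^k U_k + gamma^m V(x_m) - V(x_0). All function names, the discount factor gamma, and the utility are hypothetical choices for illustration, not the paper's definitions.

```python
import math


def unicycle_step(state, v, omega, dt):
    """One Euler step of the unicycle kinematic model (an assumed predictive model)."""
    x, y, theta = state
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)


def rbf_value(state, centers, weights, width):
    """Critic output: weighted sum of Gaussian radial basis functions."""
    total = 0.0
    for center, weight in zip(centers, weights):
        dist2 = sum((s - c) ** 2 for s, c in zip(state, center))
        total += weight * math.exp(-dist2 / (2.0 * width ** 2))
    return total


def utility(state, reference):
    """Hypothetical utility: squared tracking error on (x, y, theta)."""
    return sum((s - r) ** 2 for s, r in zip(state, reference))


def m_step_td_error(state, controls, references, value, gamma, dt):
    """m-step TD error from simulated experience (assumed standard n-step form).

    delta = sum_{k=0}^{m-1} gamma^k * U_k + gamma^m * V(x_m) - V(x_0),
    where U_k is the utility of the state reached after simulated step k
    and m is the number of (control, reference) pairs supplied.
    """
    v_start = value(state)           # first value function: V(x_0)
    discounted = 0.0
    sim = state
    m = 0
    for (v, omega), ref in zip(controls, references):
        sim = unicycle_step(sim, v, omega, dt)       # simulate one step
        discounted += gamma ** m * utility(sim, ref)  # one of the m utilities
        m += 1
    # second value function: V(x_m), evaluated on the simulated state
    return discounted + gamma ** m * value(sim) - v_start


# Usage with made-up critic parameters and a 3-step straight-line rollout.
centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights = [0.5, -0.2]
critic = lambda s: rbf_value(s, centers, weights, width=1.0)
delta = m_step_td_error(
    state=(0.0, 0.0, 0.0),
    controls=[(1.0, 0.0)] * 3,                      # (v, omega) per step
    references=[(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)],
    value=critic,
    gamma=0.9,
    dt=0.1,
)
```

In an actor-critic scheme of this kind, such a TD error would typically drive the update of the critic weights. The actual update rule and the construction of the model-based actor are not described in the abstract and are left out here.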
