International Conference on Systems, Man, and Cybernetics

High-Level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-Learning



Abstract

In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. The proposed model-free algorithm achieves high-level tracking control and stable learning by employing a novel actors-critic architecture, in which a single critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to compute the target value: the action value of the next state is obtained by maximizing, over all actors, the average of the last several previously learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified on an underactuated AUV. The results demonstrate the high tracking accuracy and stable learning of the PAQ-DPG algorithm. Moreover, under the proposed actors-critic framework, increasing the number of actors further improves performance.
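
As a concrete illustration of the target computation and actor update described above, the following is a minimal sketch in PyTorch. The network architectures, hyperparameter values, and names (Critic, Actor, paq_target, dpg_actor_update, gamma) are assumptions for illustration only and are not taken from the paper; the schedule for storing the previously learned critic estimates is likewise assumed.

```python
# Minimal PAQ-DPG sketch (illustrative only; not the authors' code).
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Action-value network Q(s, a)."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

class Actor(nn.Module):
    """Deterministic policy mu(s); the tanh output bound models input saturation."""
    def __init__(self, state_dim, action_dim, max_action=1.0, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh())
        self.max_action = max_action

    def forward(self, state):
        return self.max_action * self.net(state)

def paq_target(reward, next_state, done, actors, critic_snapshots, gamma=0.99):
    """Pseudo Averaged Q-learning target: for each actor's action at the next
    state, average the estimates of the last K critic snapshots, then take
    the maximum of these averages over all actors."""
    with torch.no_grad():
        q_per_actor = []
        for actor in actors:
            next_action = actor(next_state)
            q_avg = torch.stack(
                [q(next_state, next_action) for q in critic_snapshots]
            ).mean(dim=0)
            q_per_actor.append(q_avg)
        q_next = torch.stack(q_per_actor).max(dim=0).values
        return reward + gamma * (1.0 - done) * q_next

def dpg_actor_update(actor, critic, states, actor_optim):
    """Deterministic policy gradient step: ascend Q(s, mu(s)) in the actor weights."""
    actor_loss = -critic(states, actor(states)).mean()
    actor_optim.zero_grad()
    actor_loss.backward()
    actor_optim.step()
    return actor_loss.item()
```

In this sketch, critic_snapshots stands for the last several stored copies of the critic (for example, maintained with copy.deepcopy at a fixed update period), playing the role of the previously learned action-value estimates; the maximization over actors corresponds to the "maximizing the average ... among all actors" step described in the abstract.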
