Conference: International Conference on Control, Decision and Information Technologies

Adaptive Dynamic Programming Based Motion Control of Autonomous Underwater Vehicles


Abstract

In this paper, the Adaptive Dynamic Programming (ADP) technique is used to achieve optimal motion control of an Autonomous Underwater Vehicle (AUV) system. The paper proposes a model-free method that takes the actuator input and obstacle positions into account while tracing an optimal path. Machine learning makes it possible to develop a path planner that aims to avoid collisions with static obstacles. The ADP approach approximates the solution of the cost functional for optimization purposes, so that the positions of nearby obstacles need not be known a priori until they fall within a designed approximation safety envelope. The methodology is implemented to achieve the path-planning objective using dynamic programming. The least-squares policy method serves as a recursive algorithm to approximate the value function over the domain, providing an approach for finite-space discrete control systems. The idea behind the design of the obstacle-free path finder is to generate an optimal action that minimizes the local cost, defined by a functional, under constrained optimization. The optimal value function is described by the Hamilton-Jacobi-Bellman (HJB) equation, which is impractical to solve analytically. To avoid the complex calculations that the HJB equation entails, ADP, a method based on Reinforcement Learning (RL), is implemented. The paper outlines how machine learning can be used to realize a real-time obstacle avoidance system.
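The abstract gives no implementation details, so the following is only a minimal sketch of least-squares policy iteration on a finite, discrete problem, not the authors' method. Everything in it is an assumption made for illustration: the 5x5 grid, the obstacle cells, the step and collision costs, the discount factor, the one-hot (tabular) features, and all function names. Unlike the setting described above, the obstacles here are assumed known in advance, and transitions come from a toy simulator rather than AUV dynamics.

```python
import numpy as np

# Illustrative assumptions: grid size, goal, obstacles, costs, discount.
N = 5
GOAL = (4, 4)
OBSTACLES = {(2, 2), (1, 3)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # (d_row, d_col) moves
GAMMA = 0.95

def step(state, a):
    """Deterministic toy transition; returns (next_state, cost)."""
    if state == GOAL:
        return state, 0.0                      # goal is absorbing and free
    nxt = (state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1])
    if not (0 <= nxt[0] < N and 0 <= nxt[1] < N):
        return state, 2.0                      # leaving the grid: stay, penalty
    if nxt in OBSTACLES:
        return state, 10.0                     # collision: stay, large penalty
    return nxt, 1.0                            # ordinary move

def idx(state, a):
    """Index of the one-hot feature for the pair (state, action)."""
    return (state[0] * N + state[1]) * len(ACTIONS) + a

def lstdq(policy, samples):
    """Least-squares evaluation of `policy`: solve A w = b for Q-weights w."""
    k = N * N * len(ACTIONS)
    A = np.eye(k) * 1e-6                       # small regularizer keeps A invertible
    b = np.zeros(k)
    for s, a, cost, s2 in samples:
        i, j = idx(s, a), idx(s2, policy[s2])
        A[i, i] += 1.0                         # phi(s, a) phi(s, a)^T
        A[i, j] -= GAMMA                       # -gamma phi(s, a) phi(s', pi(s'))^T
        b[i] += cost
    return np.linalg.solve(A, b)

def lspi(samples, states, iters=50):
    """Alternate least-squares evaluation and greedy (min-cost) improvement."""
    policy = {s: 0 for s in states}
    for _ in range(iters):
        w = lstdq(policy, samples)
        greedy = {s: int(np.argmin([w[idx(s, a)] for a in range(len(ACTIONS))]))
                  for s in states}
        if greedy == policy:
            break
        policy = greedy
    return policy, w

def rollout(policy, start, max_steps=50):
    """Follow the policy from `start` until the goal or the step limit."""
    path, s = [start], start
    while s != GOAL and len(path) <= max_steps:
        s, _ = step(s, policy[s])
        path.append(s)
    return path

STATES = [(r, c) for r in range(N) for c in range(N) if (r, c) not in OBSTACLES]
samples = [(s, a, *reversed(step(s, a))) for s in STATES for a in range(len(ACTIONS))]
policy, w = lspi(samples, STATES)
path = rollout(policy, (0, 0))
print(path)
```

With one-hot features the least-squares solve reduces to exact evaluation of the current policy on the sampled transitions, which keeps the sketch easy to check. A continuous AUV state space would need a different feature set, which the abstract does not specify.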
