Intelligent Vehicles Symposium, 2005. Proceedings. IEEE

Double action Q-learning for obstacle avoidance in a dynamically changing environment

Abstract

In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation. The Markov decision process used in traditional reinforcement learning is modified so that the response of the environment is taken into account when determining the agent's next state. This is achieved by extending the action-value function to handle three parameters at a time: the current state, the action taken by the agent, and the action taken by the environment. Because it considers the actions of both the agent and the environment, the method is termed "double action". The proposed method is implemented on top of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method reduces the magnitude of the (negative) sum of rewards by 89.5% relative to the traditional method. It also lowers the total number of collisions by 89.5% and the mean number of steps per episode by 15.5% compared with the traditional method.
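
To make the three-parameter action-value function concrete, the following is a minimal tabular sketch in Python. It is a sketch under assumptions, not the paper's implementation: the action sets, the epsilon-greedy policy, and especially the bootstrap term (maximizing over the agent's next action while plugging in the environment's observed next action) are hypothetical choices, since the abstract does not spell out the modified update rule.

    import random
    from collections import defaultdict

    # Hypothetical settings; the abstract does not give the paper's values.
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
    AGENT_ACTIONS = ["left", "straight", "right"]  # assumed agent action set
    ENV_ACTIONS = ["left", "straight", "right"]    # assumed environment (obstacle) actions

    # "Double action": Q is indexed by (state, agent action, environment action).
    Q = defaultdict(float)

    def choose_action(state, env_action):
        """Epsilon-greedy over agent actions, conditioned on the environment's
        last observed action (one plausible policy; the paper may differ)."""
        if random.random() < EPSILON:
            return random.choice(AGENT_ACTIONS)
        return max(AGENT_ACTIONS, key=lambda a: Q[(state, a, env_action)])

    def update(state, a_agent, a_env, reward, next_state, next_a_env):
        """One double-action Q-learning step: the target bootstraps on the best
        agent action given the environment's observed next action."""
        best_next = max(Q[(next_state, a, next_a_env)] for a in AGENT_ACTIONS)
        Q[(state, a_agent, a_env)] += ALPHA * (reward + GAMMA * best_next
                                               - Q[(state, a_agent, a_env)])

Compared with standard Q-learning, the only structural change is the extra environment-action index, which lets the table distinguish, for example, an obstacle moving toward the vehicle from one moving away.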
