Intelligent Vehicles Symposium, 2005. Proceedings. IEEE

Double action Q-learning for obstacle avoidance in a dynamically changing environment

Abstract

In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation. The Markov decision process used in traditional reinforcement learning is modified so that the response of the environment is taken into account when determining the agent's next state. This is achieved by extending the action-value function to handle three parameters at a time: the current state, the action taken by the agent, and the action taken by the environment. Because it considers the actions of both the agent and the environment, the method is termed "double action". The proposed method is implemented on top of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method reduces the magnitude of the (negative) sum of rewards by 89.5% relative to the traditional method. It also lowers the total number of collisions by 89.5% and the mean number of steps per episode by 15.5% compared with the traditional method.
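
To make the three-parameter action-value function concrete, the following is a minimal tabular sketch in Python. It is a sketch under assumptions, not the paper's implementation: the action sets, the epsilon-greedy policy, and especially the bootstrap term (maximizing over the agent's next action while plugging in the environment's observed next action) are hypothetical choices, since the abstract does not spell out the modified update rule.

    import random
    from collections import defaultdict

    # Hypothetical settings; the abstract does not give the paper's values.
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
    AGENT_ACTIONS = ["left", "straight", "right"]  # assumed agent action set
    ENV_ACTIONS = ["left", "straight", "right"]    # assumed environment (obstacle) actions

    # "Double action": Q is indexed by (state, agent action, environment action).
    Q = defaultdict(float)

    def choose_action(state, env_action):
        """Epsilon-greedy over agent actions, conditioned on the environment's
        last observed action (one plausible policy; the paper may differ)."""
        if random.random() < EPSILON:
            return random.choice(AGENT_ACTIONS)
        return max(AGENT_ACTIONS, key=lambda a: Q[(state, a, env_action)])

    def update(state, a_agent, a_env, reward, next_state, next_a_env):
        """One double-action Q-learning step: the target bootstraps on the best
        agent action given the environment's observed next action."""
        best_next = max(Q[(next_state, a, next_a_env)] for a in AGENT_ACTIONS)
        Q[(state, a_agent, a_env)] += ALPHA * (reward + GAMMA * best_next
                                               - Q[(state, a_agent, a_env)])

Compared with standard Q-learning, the only structural change is the extra environment-action index, which lets the table distinguish, for example, an obstacle moving toward the vehicle from one moving away.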
