Path Planning for Mobile Robots Based on Deep Reinforcement Learning (基于深度强化学习的移动机器人路径规划)

DONG Yao; GE Yingying; GUO Hongyong; DONG Yongfeng; YANG Chen


Abstract

To address the slow convergence of the basic deep Q-network when a robot explores a complex, unknown environment, this paper proposes an improved deep double Q-network algorithm built on the dueling network structure (Improved Dueling Deep Double Q-Network, IDDDQN). The mobile robot uses the improved DDQN network to estimate the state-action value function of its three actions, updates the network parameters, and obtains the corresponding Q-values through training. With an exploration strategy that combines the Boltzmann distribution with ε-greedy, the robot selects an optimal action and moves to the next observation. The data collected during learning are stored in the experience replay memory through an improved resampling-based selection mechanism, and the network is trained on mini-batches drawn from that memory. The experimental results show that, compared with the basic DDQN algorithm, a robot trained with IDDDQN adapts to the unknown environment more quickly, the network converges faster, the success rate of reaching the target position increases by more than three times, and the robot is better able to find the optimal path in an unknown, complex environment.
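The three ingredients named in the abstract (dueling aggregation, the double-Q target, and a mixed Boltzmann/ε-greedy exploration rule) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how the two exploration rules are combined, so `select_action` assumes a greedy choice with probability 1 - ε and a Boltzmann sample otherwise. The function names, the `temperature` parameter, and the default `gamma` are all hypothetical, and the improved resampling mechanism for the replay memory is not modelled here.

```python
import math
import random


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


def select_action(q_values, epsilon, temperature, rng=random):
    """ASSUMED hybrid rule (not taken from the paper): act greedily with
    probability 1 - epsilon, otherwise sample from a Boltzmann (softmax)
    distribution over the Q-values."""
    greedy = max(range(len(q_values)), key=q_values.__getitem__)
    if rng.random() >= epsilon:
        return greedy
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


# Example with three actions, matching the robot's action count in the abstract.
q = dueling_q(1.0, [0.5, -0.5, 0.0])
action = select_action(q, epsilon=0.1, temperature=0.5)
target = double_dqn_target(1.0, False, [0.2, 0.9, 0.1], [5.0, 2.0, 7.0], gamma=0.5)
```

Sampling from the Boltzmann distribution during exploration favours actions with higher estimated Q-values over a uniform random pick, which is one plausible reason to combine it with ε-greedy; the exact schedule used in the paper may differ.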

Bibliographic record

Source
Computer Engineering and Applications (《计算机工程与应用》) | 2019, No. 13 | pp. 15-19, 157 | 6 pages
Authors
DONG Yao; GE Yingying; GUO Hongyong; DONG Yongfeng; YANG Chen
Author affiliations
1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
2. Hebei Provincial Key Laboratory of Big Data Computing, Hebei University of Technology, Tianjin 300401, China
3. Hebei University of Engineering, Handan, Hebei 056038, China
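Original format: PDF
Language of full text: Chinese
CLC classification: applications in other fields
Keywords
deep double Q-network (DDQN); dueling network structure; resampling-based selection mechanism; Boltzmann distribution; ε-greedy strategy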

