Journal: International Journal of Computational Vision and Robotics

Deep reinforcement learning collision avoidance using policy gradient optimisation and Q-learning

Abstract

This work proposes using trust region policy optimisation (TRPO) and proximal policy optimisation (PPO), both descendants of the policy gradient optimisation method, together with a deep Q-learning network (DQN), on Lidar-based differential-drive robots, using Turtlebot and the optimisation methods in OpenAI's Baselines. The simulation results showed that all three algorithms are well suited to obstacle avoidance and robot navigation, with a clear advantage for TRPO and PPO in complex environments. The learned policies can be used in a fully decentralised manner, as they are not constrained by any robot parameters or communication protocols.
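
For orientation, the two learning rules that separate the compared families can be sketched in a few lines. This is an illustrative sketch only, not the paper's code, which the abstract attributes to OpenAI's Baselines. The function names, the clipping range 0.2 and the discount factor 0.99 are assumptions chosen for the example. TRPO is not shown; it limits each policy update with a KL-divergence trust region rather than with clipping.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximised).

    logp_new / logp_old are log-probabilities of the taken actions under the
    current and the data-collecting policy; clip_eps is an assumed value.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def dqn_td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    At the end of an episode (for example after a collision) the target
    is the reward alone.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a Lidar-based setting, the state fed to either learner would typically be the range scan, and the DQN variant requires a discrete set of velocity commands, whereas the policy gradient variants can also output continuous ones.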
