IEEE International Conference on Software Engineering and Service Science

Orthogonal Policy Gradient and Autonomous Driving Application



Abstract

One less-addressed issue in deep reinforcement learning is the lack of generalization capability to new states and new targets. For complex tasks, it is necessary to give the correct strategy and to evaluate all possible actions for the current state. Fortunately, deep reinforcement learning has enabled enormous progress on both subproblems: giving the correct strategy and evaluating all actions based on the state. In this paper we present an approach called orthogonal policy gradient descent (OPGD) that enables an agent to learn the policy gradient based on the current state and the action set, by which the agent can learn a policy network with generalization capability. We evaluate the proposed method in the 3D autonomous driving environment TORCS against a baseline model; detailed analyses of the experimental results and proofs are also given.
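For orientation, the update the abstract builds on is the standard policy gradient, grad_theta J(theta) = E[grad_theta log pi_theta(a|s) * G]. The sketch below is a minimal REINFORCE-style implementation of that generic update on a toy task, assuming a tabular softmax policy; the abstract does not describe OPGD's orthogonalization step, so this is a baseline illustration rather than the authors' method, and all names (n_states, sample_episode, etc.) are hypothetical.

```python
import numpy as np

# Minimal REINFORCE sketch on a toy task (illustrative, not the paper's OPGD).
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
theta = np.zeros((n_states, n_actions))  # logits of a tabular softmax policy

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sample_episode(T=10):
    # Toy dynamics: states are drawn i.i.d.; reward is 1 when the action
    # matches the state modulo n_actions, else 0.
    s = int(rng.integers(n_states))
    traj = []
    for _ in range(T):
        probs = softmax(theta[s])
        a = int(rng.choice(n_actions, p=probs))
        r = 1.0 if a == s % n_actions else 0.0
        traj.append((s, a, r))
        s = int(rng.integers(n_states))
    return traj

alpha, gamma = 0.1, 0.99
for _ in range(2000):
    G = 0.0
    for s, a, r in reversed(sample_episode()):
        G = r + gamma * G                 # return-to-go
        probs = softmax(theta[s])
        grad_log = -probs                 # gradient of log pi(a|s) w.r.t. logits
        grad_log[a] += 1.0
        theta[s] += alpha * G * grad_log  # vanilla REINFORCE ascent step
```

After training, softmax(theta[s]) concentrates on the rewarded action for each state; OPGD, per the abstract, modifies how this gradient is computed from the current state and action set to improve generalization.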
