
Modification of Q-learning to Adapt to the Randomness of Environment


Abstract

Q-learning is a typical model-free reinforcement learning algorithm that achieves a goal by interacting with an uncertain environment. However, conventional Q-learning fails to converge, and can even learn poor policies, when the state transitions and the immediate rewards of the environment are randomly distributed. This paper modifies the Q-learning algorithm with a Monte Carlo method to address these problems. Simulation experiments are then performed to validate the modified Q-learning algorithm.
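The abstract does not spell out the modification, so the following is only a rough illustration of the general idea, not the paper's algorithm. It assumes a tabular setting and contrasts the conventional one-sample update with a variant that averages the update target over several sampled outcomes of the same state-action pair, which is one plausible way a Monte Carlo estimate could damp the noise from random transitions and rewards. The function names `q_update` and `mc_q_update`, the list-of-lists table `q`, and the default values of `alpha` and `gamma` are all hypothetical.

```python
def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Conventional Q-learning: move Q(s, a) toward a single sampled target."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def mc_q_update(q, s, a, samples, alpha=0.1, gamma=0.9):
    """Illustrative variant (an assumption, not the paper's method):
    average the target over several sampled (reward, next_state) pairs
    observed for the same (s, a), then take one step toward that average.
    """
    targets = [r + gamma * max(q[s2]) for r, s2 in samples]
    mean_target = sum(targets) / len(targets)
    q[s][a] += alpha * (mean_target - q[s][a])
```

With a single sample the two updates coincide. With more samples the averaged target has lower variance, at the cost of needing repeated observations of the same state-action pair before each update.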
