Conference proceedings: Cognitive Computing - ICCC 2018

Reinforcement Learning with Monte Carlo Sampling in Imperfect Information Problems



Abstract

Artificial intelligence is an approach that analyzes, studies, and optimizes human strategies in challenging domains. Unlike perfect-information problems, imperfect-information problems are usually more complex because the accuracy of condition estimation cannot be effectively guaranteed. As a result, imperfect-information problems need much more training data or a much longer learning process when supervised or unsupervised learning systems are used. This paper presents and evaluates a novel algorithm that uses Monte Carlo sampling to estimate terminal states in reinforcement learning systems. In each iteration, the learning system uses the algorithm to compute an adjusted result that smooths the fluctuation of imperfect-information conditions. We apply the new algorithm to build a deep neural network (DNN) learning system in our Texas Hold'em poker program. The comparison poker program ranked third in the Annual Computer Poker Competition 2017 (ACPC 2017); the system using the new approach performs better and converges much faster.
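
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of estimating a terminal state's value by Monte Carlo sampling over hidden information. It assumes a toy one-card showdown game rather than Texas Hold'em, and every name in it (`DECK`, `showdown_payoff`, `mc_terminal_value`, `exact_terminal_value`) is hypothetical. It is not the authors' algorithm.

```python
import random
import statistics

# Toy deck (assumption): four copies of each rank 1..13, no suits.
DECK = [rank for rank in range(1, 14) for _ in range(4)]


def showdown_payoff(my_card, opp_card, pot):
    """Payoff to us at a terminal state: +pot on a win, -pot on a loss, 0 on a tie."""
    if my_card > opp_card:
        return pot
    if my_card < opp_card:
        return -pot
    return 0.0


def remaining_cards(my_card):
    """Cards the opponent could hold once our own card is removed from the deck."""
    remaining = DECK.copy()
    remaining.remove(my_card)
    return remaining


def mc_terminal_value(my_card, pot, n_samples, rng):
    """Average the payoff over n_samples sampled opponent cards."""
    remaining = remaining_cards(my_card)
    total = 0.0
    for _ in range(n_samples):
        opp_card = rng.choice(remaining)  # sample the hidden information
        total += showdown_payoff(my_card, opp_card, pot)
    return total / n_samples


def exact_terminal_value(my_card, pot):
    """Exact expectation by enumeration; feasible only because the toy game is tiny."""
    remaining = remaining_cards(my_card)
    return sum(showdown_payoff(my_card, c, pot) for c in remaining) / len(remaining)


if __name__ == "__main__":
    rng = random.Random(0)
    # A single sampled outcome is a noisy training target; the averaged one is smoother.
    single = [mc_terminal_value(10, 1.0, 1, rng) for _ in range(200)]
    averaged = [mc_terminal_value(10, 1.0, 100, rng) for _ in range(200)]
    print("exact value:", exact_terminal_value(10, 1.0))
    print("std of single-sample targets:", statistics.pstdev(single))
    print("std of 100-sample targets:", statistics.pstdev(averaged))
```

In this sketch, the averaged estimate would stand in for the raw game outcome as the learning target at a terminal state, which is one plausible reading of how sampling can smooth fluctuations caused by hidden information.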
