

Reinforcement Learning with Monte Carlo Sampling in Imperfect Information Problems


Abstract

Artificial intelligence is an approach that analyzes, studies, and optimizes human strategies in challenging domains. Compared with perfect information problems, imperfect information problems are usually more complex because the accuracy of condition estimation cannot be effectively guaranteed. As a result, imperfect information problems need much more training data or a much longer learning process when supervised and unsupervised learning systems are used. This paper presents and evaluates a novel algorithm that uses Monte Carlo sampling as the terminal-state estimation method in reinforcement learning systems. In each iteration, the learning system uses the new algorithm to calculate an adjusted result that smooths the fluctuation of imperfect information conditions. In this paper, we apply the new algorithm to build a deep neural network (DNN) learning system in our Texas Hold'em poker program. The poker program used for comparison ranked third in the Annual Computer Poker Competition 2017 (ACPC 2017); the system with the new approach shows better performance while converging much faster.
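
As an illustration only of the general idea named in the abstract (estimating the value of a terminal state by Monte Carlo sampling over hidden information), the following is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. The toy one-card "high card wins" game, the uniform sampling of the opponent's card, and the names `showdown_payoff` and `mc_terminal_value` are all assumptions made for this example.

```python
import random


def showdown_payoff(my_card, opp_card, pot):
    """Payoff to us at a terminal showdown in a toy high-card game (assumed rules)."""
    if my_card > opp_card:
        return pot
    if my_card < opp_card:
        return -pot
    return 0.0


def mc_terminal_value(my_card, deck, pot, n_samples=1000, rng=None):
    """Estimate the expected terminal payoff when the opponent's card is hidden.

    The opponent's card is sampled uniformly from the cards we cannot see,
    and the payoffs are averaged. Averaging over many samples reduces the
    fluctuation that a single guess about the hidden card would cause.
    """
    rng = rng or random.Random()
    unseen = list(deck)
    unseen.remove(my_card)  # our own card cannot be held by the opponent
    total = 0.0
    for _ in range(n_samples):
        opp_card = rng.choice(unseen)
        total += showdown_payoff(my_card, opp_card, pot)
    return total / n_samples


if __name__ == "__main__":
    deck = list(range(1, 14))  # hypothetical 13-card deck, one card per rank
    estimate = mc_terminal_value(10, deck, pot=2.0, n_samples=5000,
                                 rng=random.Random(0))
    print(f"estimated terminal value holding a 10: {estimate:.3f}")
```

In a reinforcement learning loop, an estimate of this kind could serve as the training target for terminal states instead of a single noisy outcome; how the paper actually adjusts its result in each iteration is not described in the abstract.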


Bibliographic details

Source
Cognitive computing - ICCC 2018 | 2018 | pp. 55-67 | 13 pages
Conference location: Seattle (US)
Authors
Jiajia Zhang; Hong Liu
Author affiliations

Shenzhen Graduate School, Peking University, Shenzhen 518055, China (listed for both authors)

Original format: PDF
Language: English (eng)
Keywords
Reinforcement learning; Monte Carlo sampling; Imperfect information


Similar literature

1. Design of control framework based on deep reinforcement learning and Monte-Carlo sampling in downstream separation [J]. Soonho Hwangbo, Guerkan Sin. Computers & Chemical Engineering, 2020, issue Sepa2.
2. Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation [J]. Haeun Yoo, Boeun Kim, Jong Woo Kim, et al. Computers & Chemical Engineering, 2021, issue Jana4.
3. Event Driven Duty Cycling with Reinforcement Learning and Monte Carlo Technique for Wireless Network [J]. Han Yao Huang, Tae-Jin Lee, Hee Yong Youn. Mobile Information Systems, 2021, issue a.
4. Reinforcement Learning with Monte Carlo Sampling in Imperfect Information Problems [C]. Jiajia Zhang, Hong Liu. International Conference on Cognitive Computing, 2018.
5. A new population Monte Carlo method using data reinforcement [D]. Luo, Xiaoxian. 2007.
6. Towards efficient discovery of green synthetic pathways with Monte Carlo tree search and reinforcement learning [O]. Xiaoxue Wang, Yujie Qian, Hanyu Gao, et al. 2020.
7. A reinforcement learning application of a guided Monte Carlo Tree Search algorithm for beam orientation selection in radiation therapy [O]. Azar Sadeghnejad-Barkousaraie, Gyanendra Bohara, Steve Jiang, et al. 2021.
8. Evaluation of sampling plans for in-service inspection of steam generator tubes. Volume 2, Comprehensive analytical and Monte Carlo simulation results for several sampling plans [R]. 1994.
