
Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search

Abstract

Bayes-optimal behavior, while well-defined, is often difficult to achieve. Recent advances in the use of Monte-Carlo tree search (MCTS) have shown that it is possible to act near-optimally in Markov Decision Processes (MDPs) with very large or infinite state spaces. Bayes-optimal behavior in an unknown MDP is equivalent to optimal behavior in the known belief-space MDP, although the size of this belief-space MDP grows exponentially with the amount of history retained, and is potentially infinite. We show how an agent can use one particular MCTS algorithm, Forward Search Sparse Sampling (FSSS), in an efficient way to act nearly Bayes-optimally for all but a polynomial number of steps, assuming that FSSS can be used to act efficiently in any possible underlying MDP.
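The planning view in the abstract can be made concrete with a small sketch. Below is a minimal Python rendering of vanilla sparse sampling applied to belief states, in the spirit of (but far simpler than) FSSS; the belief representation and the sample_step function are hypothetical stand-ins for a posterior over MDP models and a simulator that draws a model from that posterior.

def sparse_sample_plan(belief, depth, width, actions, sample_step, gamma=0.95):
    # Return (best_action, estimated_value) for a belief state by building
    # a depth-limited lookahead tree with `width` sampled successors per action.
    if depth == 0:
        return None, 0.0
    best_action, best_value = None, float("-inf")
    for a in actions:
        total = 0.0
        for _ in range(width):
            # Hypothetical: sample_step draws an MDP from the posterior,
            # simulates one step, and returns the reward and updated belief.
            reward, next_belief = sample_step(belief, a)
            _, future = sparse_sample_plan(next_belief, depth - 1, width,
                                           actions, sample_step, gamma)
            total += reward + gamma * future
        value = total / width
        if value > best_value:
            best_action, best_value = a, value
    return best_action, best_value

FSSS itself improves on this naive exhaustive recursion by maintaining upper and lower bounds on the value of each node and, roughly, expanding only branches whose bounds could still change the greedy action choice, which is what makes planning in the exponentially large belief-space MDP practical.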