Practical Open-Loop Optimistic Planning


Abstract

We consider the problem of online planning in a Markov Decision Process given only access to a generative model, restricted to open-loop policies (i.e., sequences of actions), and under a budget constraint. In this setting, the Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical guarantees but is overly conservative in practice, as we show in numerical experiments. We propose a modified version of the algorithm with tighter upper-confidence bounds, KL-OLOP, that leads to better practical performance while retaining the sample complexity bound. Finally, we propose an efficient implementation that significantly improves the time complexity of both algorithms.
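
To make the phrase "tighter upper-confidence bounds" concrete, the sketch below compares a Hoeffding-style upper bound on an empirical mean with a Kullback-Leibler (Bernoulli) upper bound computed by bisection. It is only an illustration under stated assumptions: rewards are assumed to lie in [0, 1], and `threshold` is a hypothetical exploration parameter, not the constant used in the paper. The function names are likewise made up for this example and are not taken from the authors' implementation.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def hoeffding_ucb(mean, count, threshold):
    """Hoeffding-style bound: mean + sqrt(threshold / (2 * count)), clipped to 1."""
    return min(1.0, mean + math.sqrt(threshold / (2 * count)))


def kl_ucb(mean, count, threshold, iters=50):
    """Largest q in [mean, 1] with count * kl(mean, q) <= threshold, found by bisection."""
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if count * bernoulli_kl(mean, mid) <= threshold:
            lo = mid  # mid is still feasible, search higher
        else:
            hi = mid  # mid violates the constraint, search lower
    return lo


if __name__ == "__main__":
    # Hypothetical values: empirical mean 0.9 from 10 samples, threshold 2.0.
    print("Hoeffding:", hoeffding_ucb(0.9, 10, 2.0))
    print("KL:       ", kl_ucb(0.9, 10, 2.0))
```

By Pinsker's inequality, kl(p, q) >= 2(p - q)^2, so the KL bound computed here never exceeds the unclipped Hoeffding-style bound for the same threshold, and it always stays within [0, 1]. This is the sense in which a KL-based bound can be less conservative.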


Bibliographic details

Source
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | 2019 | pp. 69-85 | 17 pages
Authors
Edouard Leurent; Odalric-Ambrym Maillard
Original format: PDF
Keywords
Planning; Online learning; Tree search


Similar literature


1. When Is Difficult Planning Good Planning? The Effects of Scenario-Based Planning on Optimistic Prediction Bias [J]. Min K.S., Arkes H.R. Journal of applied social psychology. 2012, No. 11
2. Open-Loop Beamforming Technique for MIMO System and Its Practical Realization [J]. Peerapong Uthansakul, Apinya Innok, Monthippa Uthansakul. International journal of antennas and propagation. 2011, No. 2
3. Open-Loop Beamforming Technique for MIMO System and Its Practical Realization [J]. Peerapong Uthansakul, Apinya Innok, Monthippa Uthansakul. International journal of antennas and propagation. 2011, No. 1
4. Balance between optimistic planning and pessimistic planning in a mission critical project [C]. Fujisawa H., Sako H. Engineering Management Conference, 2003. IEMC '03. Managing Technologically Driven Organizations: The Human Side of Innovation and Change. 2003
5. Essays in energy policy and planning modeling under uncertainty: Value of information, optimistic biases, and simulation of capacity markets [D]. Hu, Ming-Che. 2009
6. Stochastic optimal open-loop control as a theory of force and impedance planning via muscle co-contraction [O]. Bastien Berret, Frédéric Jean. 2020
7. Practical Open-Loop Optimistic Planning [O]. Edouard Leurent, Odalric-Ambrym Maillard. 2020
