Prioritizing Bellman Backups Without a Priority Queue


Abstract

Several researchers have shown that the efficiency of value iteration, a dynamic programming algorithm for Markov decision processes, can be improved by prioritizing the order of Bellman backups to focus computation on states where the value function can be improved the most. In previous work, a priority queue has been used to order backups. Although maintaining the priority queue incurs overhead, previous work has argued that this overhead is usually much smaller than the benefit from prioritization. However, this conclusion is usually based on a comparison to a non-prioritized approach that performs Bellman backups on states in an arbitrary order. In this paper, we show that the overhead of maintaining the priority queue can exceed its benefit when it is compared to very simple heuristics for prioritizing backups that do not require a priority queue. Although the order of backups induced by our simple approach is often sub-optimal, we show that its smaller overhead allows it to converge faster than other state-of-the-art priority-based solvers.
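To make the idea of ordering backups without a priority queue concrete, here is a minimal Python sketch. The abstract does not say which heuristics the paper uses, so the ordering chosen here (a single breadth-first traversal backward from the goal, computed once with a FIFO queue) is an assumption for illustration only. The MDP encoding, the function names, and the four-state chain are likewise hypothetical.

```python
from collections import deque


def bellman_backup(state, values, transitions, gamma=1.0):
    """Return the minimum expected cost-to-go over the actions of `state`."""
    best = float("inf")
    for cost, outcomes in transitions[state].values():
        q = cost + gamma * sum(p * values[nxt] for nxt, p in outcomes)
        best = min(best, q)
    return best


def backward_order(transitions, goal):
    """Order states by breadth-first search backward from the goal.

    Assumed heuristic (not taken from the abstract): states closer to the
    goal are backed up first, and the order is computed once with a plain
    FIFO queue instead of being maintained in a priority queue.
    """
    predecessors = {s: set() for s in transitions}
    for s, actions in transitions.items():
        for _cost, outcomes in actions.values():
            for nxt, _p in outcomes:
                predecessors[nxt].add(s)
    order, seen, frontier = [], {goal}, deque([goal])
    while frontier:
        s = frontier.popleft()
        order.append(s)
        for pred in sorted(predecessors[s]):
            if pred not in seen:
                seen.add(pred)
                frontier.append(pred)
    return order


def value_iteration(transitions, goal, order, epsilon=1e-6, max_sweeps=10_000):
    """Sweep the states in `order` until the largest change is below epsilon."""
    values = {s: 0.0 for s in transitions}
    for sweep in range(1, max_sweeps + 1):
        residual = 0.0
        for s in order:
            if s == goal:
                continue  # the goal keeps cost-to-go 0
            new_value = bellman_backup(s, values, transitions)
            residual = max(residual, abs(new_value - values[s]))
            values[s] = new_value
        if residual < epsilon:
            return values, sweep
    return values, max_sweeps


# Hypothetical deterministic chain s3 -> s2 -> s1 -> goal, unit cost per step.
# Each action maps to (cost, [(next_state, probability), ...]).
chain = {
    "s3": {"go": (1.0, [("s2", 1.0)])},
    "s2": {"go": (1.0, [("s1", 1.0)])},
    "s1": {"go": (1.0, [("goal", 1.0)])},
    "goal": {},
}

values_backward, sweeps_backward = value_iteration(
    chain, "goal", backward_order(chain, "goal")
)
values_forward, sweeps_forward = value_iteration(
    chain, "goal", ["s3", "s2", "s1", "goal"]
)
print(sweeps_backward, sweeps_forward)
```

On this toy chain both orders reach the same values, but the backward order needs fewer sweeps than the forward one. It says nothing about the paper's benchmarks; it only shows that a cheap, fixed ordering can capture part of the benefit of prioritization without per-update queue maintenance.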
