European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2011)

Lagrange Dual Decomposition for Finite Horizon Markov Decision Processes



Abstract

Solving finite-horizon Markov Decision Processes with stationary policies is a computationally difficult problem. Our dynamic dual decomposition approach uses Lagrange duality to decouple this hard problem into a sequence of tractable sub-problems. The resulting procedure is a straightforward modification of standard non-stationary Markov Decision Process solvers and gives an upper bound on the total expected reward. The empirical performance of the method suggests that it not only converges rapidly, but also performs favourably compared to standard planning algorithms such as policy gradients and lower-bound procedures such as Expectation Maximisation.
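The abstract refers to standard non-stationary solvers and to an upper bound on the total expected reward. As a rough illustration only, the sketch below shows plain backward induction (a standard non-stationary finite-horizon solver) on a made-up two-state, two-action MDP, and checks that its value is at least that of every deterministic stationary policy. It is not the paper's dual decomposition procedure; the toy numbers, the function names `backward_induction` and `evaluate_stationary`, and the array layout `P[a][s][s2]`, `R[s][a]` are all assumptions made for this example.

```python
from itertools import product


def backward_induction(P, R, horizon):
    """Solve a finite-horizon MDP with a time-dependent (non-stationary) policy.

    P[a][s][s2] is the transition probability, R[s][a] the immediate reward.
    Returns (V, plan): V[s] is the optimal expected total reward from state s,
    and plan[t][s] is the greedy action at time step t.
    """
    n_states, n_actions = len(R), len(R[0])
    V = [0.0] * n_states  # value with zero steps remaining
    plan = []
    for _ in range(horizon):
        # Q[s][a]: reward now plus expected value of the remaining steps.
        Q = [[R[s][a] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        step = [max(range(n_actions), key=lambda a: Q[s][a])
                for s in range(n_states)]
        V = [Q[s][step[s]] for s in range(n_states)]
        plan.append(step)
    plan.reverse()  # built from the last time step backwards
    return V, plan


def evaluate_stationary(P, R, horizon, pi):
    """Expected total reward of a fixed deterministic stationary policy pi[s]."""
    n_states = len(R)
    V = [0.0] * n_states
    for _ in range(horizon):
        V = [R[s][pi[s]] + sum(P[pi[s]][s][s2] * V[s2] for s2 in range(n_states))
             for s in range(n_states)]
    return V


# Hypothetical toy problem (numbers invented for illustration).
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.1, 0.9], [0.7, 0.3]],  # action 1
]
R = [[1.0, 0.0], [0.0, 2.0]]  # R[s][a]
H = 3

v_upper, plan = backward_induction(P, R, H)
stationary_values = [evaluate_stationary(P, R, H, pi)
                     for pi in product(range(2), repeat=2)]
```

Because a stationary policy is a special case of a non-stationary one, every entry of `stationary_values` should be bounded above by `v_upper` in each state. This unconstrained relaxation is the simplest such bound; the abstract's method instead obtains its bound through Lagrange duality.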
