...
首页> 外文期刊>IIE Transactions >Sampled fictitious play for multi-action stochastic dynamic programs
【24h】

Sampled fictitious play for multi-action stochastic dynamic programs

机译:多动作随机动态程序的虚拟游戏样本

获取原文
获取原文并翻译 | 示例
           

摘要

This article introduces a class of finite-horizon dynamic optimization problems that are called multi-action stochastic Dynamic Programs (DPs). Their distinguishing feature is that the decision in each state is a multi-dimensional vector. These problems can in principle be solved using Bellman's backward recursion. However, the complexity of this procedure grows exponentially in the dimension of the decision vectors. This is called the curse of action space dimensionality. To overcome this computational challenge, an approximation algorithm is proposed that is rooted in the game-theoretic paradigm of Sampled Fictitious Play (SFP). SFP solves a sequence of DPs with a one-dimensional action space that are exponentially smaller than the original multi-action stochastic DP. In particular, the computational effort in a fixed number of SFP iterations is linear in the dimension of the decision vectors. It is shown that the sequence of SFP iterates converges to a local optimum, and a numerical case study in manufacturing is presented in which SFP is able to find solutions with objective values within 1% of the optimal objective value hundreds of times faster than the time taken by backward recursion. In this case study, SFP solutions are also better by a statistically significant margin than those found by a one-step look ahead heuristic.
机译:本文介绍了一类称为多动作随机动态程序(DP)的有限水平动态优化问题。它们的区别在于,每个状态下的决策都是多维向量。这些问题原则上可以使用Bellman的向后递归来解决。但是,此过程的复杂度在决策向量的维度上呈指数增长。这被称为动作空间维数的诅咒。为了克服这一计算难题,提出了一种近似算法,该算法植根于采样虚拟游戏(SFP)的博弈论范例中。 SFP解决了一系列具有一维动作空间的DP,这些动作空间比原始的多动作随机DP指数减小。特别地,固定数量的SFP迭代中的计算工作量在决策向量的维度上是线性的。结果表明,SFP迭代序列收敛到局部最优,并且在制造过程中进行了数值案例研究,其中SFP能够找到目标值在最佳目标值的1%以内的解决方案,其速度比时间快数百倍。通过向后递归获取。在此案例研究中,SFP解决方案在统计上也要比单步向前启发式发现的解决方案好。

著录项

  • 来源
    《IIE Transactions》 |2014年第7期|742-756|共15页
  • 作者单位

    Industrial Engineering, University of Washington, Seattle, WA 98195, USA;

    School of Information Systems, Singapore Management University, Singapore;

    United States Postal Service, Arlington, VA, USA;

    Operations Research Activity, General Motors R&D and Strategic Planning, Warren, MI, USA;

    Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, USA;

    Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, USA;

  • 收录信息 美国《科学引文索引》(SCI);美国《工程索引》(EI);
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

    Approximate dynamic programming; game theory; operations management;

    机译:近似动态编程;博弈论营运管理;

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号