Sampled fictitious play for multi-action stochastic dynamic programs

ARCHIS GHATE; SHIH-FEN CHENG; STEPHEN BAUMERT; DANIEL REAUME; DUSHYANT SHARMA; ROBERT L. SMITH

首页> 外文期刊>IIE Transactions >Sampled fictitious play for multi-action stochastic dynamic programs

【24h】

Sampled fictitious play for multi-action stochastic dynamic programs

机译：多动作随机动态程序的虚拟游戏样本

获取原文

获取原文并翻译 | 示例

掌桥外文数据库（机构版） >>

开具论文收录证明 >>

文献代查 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

This article introduces a class of finite-horizon dynamic optimization problems that are called multi-action stochastic Dynamic Programs (DPs). Their distinguishing feature is that the decision in each state is a multi-dimensional vector. These problems can in principle be solved using Bellman's backward recursion. However, the complexity of this procedure grows exponentially in the dimension of the decision vectors. This is called the curse of action space dimensionality. To overcome this computational challenge, an approximation algorithm is proposed that is rooted in the game-theoretic paradigm of Sampled Fictitious Play (SFP). SFP solves a sequence of DPs with a one-dimensional action space that are exponentially smaller than the original multi-action stochastic DP. In particular, the computational effort in a fixed number of SFP iterations is linear in the dimension of the decision vectors. It is shown that the sequence of SFP iterates converges to a local optimum, and a numerical case study in manufacturing is presented in which SFP is able to find solutions with objective values within 1％ of the optimal objective value hundreds of times faster than the time taken by backward recursion. In this case study, SFP solutions are also better by a statistically significant margin than those found by a one-step look ahead heuristic.

机译：本文介绍了一类称为多动作随机动态程序（DP）的有限水平动态优化问题。它们的区别在于，每个状态下的决策都是多维向量。这些问题原则上可以使用Bellman的向后递归来解决。但是，此过程的复杂度在决策向量的维度上呈指数增长。这被称为动作空间维数的诅咒。为了克服这一计算难题，提出了一种近似算法，该算法植根于采样虚拟游戏（SFP）的博弈论范例中。 SFP解决了一系列具有一维动作空间的DP，这些动作空间比原始的多动作随机DP指数减小。特别地，固定数量的SFP迭代中的计算工作量在决策向量的维度上是线性的。结果表明，SFP迭代序列收敛到局部最优，并且在制造过程中进行了数值案例研究，其中SFP能够找到目标值在最佳目标值的1％以内的解决方案，其速度比时间快数百倍。通过向后递归获取。在此案例研究中，SFP解决方案在统计上也要比单步向前启发式发现的解决方案好。

著录项

来源
《IIE Transactions》 |2014年第7期|742-756|共15页
作者
ARCHIS GHATE; SHIH-FEN CHENG; STEPHEN BAUMERT; DANIEL REAUME; DUSHYANT SHARMA; ROBERT L. SMITH;
展开▼
作者单位

Industrial Engineering, University of Washington, Seattle, WA 98195, USA;

School of Information Systems, Singapore Management University, Singapore;

United States Postal Service, Arlington, VA, USA;

Operations Research Activity, General Motors R&D and Strategic Planning, Warren, MI, USA;

Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, USA;

Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, USA;

展开▼
收录信息美国《科学引文索引》(SCI);美国《工程索引》(EI);
原文格式 PDF
正文语种 eng
中图分类
关键词
Approximate dynamic programming; game theory; operations management;

机译：近似动态编程;博弈论营运管理;

相似文献

外文文献
中文文献
专利

1. Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems [J] . Dolinskaya Irina S., Epelman Marina A., Sir Esra Sisikoglu, Journal of Optimization Theory and Applications . 2016,第2期

机译：解决确定性动态规划问题的无参数采样虚拟游戏
2. Sampled fictitious play for approximate dynamic programming [J] . Marina Epelman, Archis Ghate, Robert L Smith Computers & operations research . 2011,第12期

机译：虚拟游戏样本用于近似动态编程
3. Dynamic Fictitious Play, Dynamic Gradient Play, and Distributed Convergence to Nash Equilibria [J] . Jeff S. Shamma, Gurdal Arslan IEEE Transactions on Automatic Control . 2005,第3期

机译：动态虚拟游戏，动态渐变游戏和分布式收敛到纳什均衡
4. Utility-based Stochastic Fictitious Play and Wireless Medium Access Control Design [C] . Wenyuan Liu, Haiqiang Wang, Huixian Liu, . 2007

机译：基于实用程序的随机虚拟播放和无线媒体访问控制设计
5. Sampled joint strategy fictitious play: A new approach to discrete optimization via simulation. [D] . Sinha, Kaushik. 2006

机译：联合战略样本虚拟游戏：一种通过仿真进行离散优化的新方法。
6. Stochasticity of replication forks’ speeds plays a key role in the dynamics of DNA replication [O] . Razie Yousefi, Maga Rowicka 2019

机译：复制叉的随机性在DNA复制动力学中起关键作用
7. Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems [O] . Irina S. Dolinskaya, Marina A. Epelman, Esra Şişikoğlu Sir, 2015

机译：无参数采样虚拟游戏解决确定性动态规划问题
8. Convergence of Sample Path Optimal Policies for Stochastic Dynamic Programming [R] . Fu, M. C. , Jin, X. 2005

机译：随机动态规划的样本路径最优策略的收敛性

Sampled fictitious play for multi-action stochastic dynamic programs

摘要

著录项

相似文献

相关主题

期刊订阅