Journal: Naval Research Logistics

Strategic Capacity Decision-Making in a Stochastic Manufacturing Environment Using Real-Time Approximate Dynamic Programming

Abstract

In this study, we illustrate a real-time approximate dynamic programming (RTADP) method for solving multistage capacity decision problems in a stochastic manufacturing environment, using an exemplary three-stage manufacturing system with recycle. The system is a moderate-size queuing network that experiences stochastic variations in demand and product yield. The dynamic capacity decision problem is formulated as a Markov decision process (MDP). The proposed RTADP method starts with a set of heuristics and learns a superior-quality solution by interacting with the stochastic system via simulation. The curse of dimensionality associated with DP methods is alleviated by adopting several notions: an "evolving set of relevant states," for which the value function table is built and updated; an "adaptive action set" for keeping track of attractive action candidates; and a "nonparametric k nearest neighbor averager" for value function approximation. The performance of the learned solution is evaluated against (1) an "ideal" solution derived using a mixed integer programming (MIP) formulation, which assumes full knowledge of future realized values of the stochastic variables; (2) a myopic heuristic solution; and (3) a sample-path-based rolling-horizon MIP solution. The policy learned through the RTADP method turned out to be superior to the policies of (2) and (3).
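
As a rough illustration of the "nonparametric k nearest neighbor averager" named in the abstract, the sketch below estimates the value of an unvisited state by averaging the stored values of its k closest neighbors in a value function table. The abstract does not give the distance metric, the choice of k, or the table layout, so the function name `knn_value`, the dictionary `value_table`, the Euclidean distance, and all numbers are assumptions made for illustration only, not the authors' implementation.

```python
import math


def knn_value(query, table, k=3):
    """Average the stored values of the k states nearest to `query`.

    `table` maps state tuples to value estimates (a hypothetical layout).
    Euclidean distance is an assumed metric; the abstract does not name one.
    """
    if not table:
        raise ValueError("value table is empty")
    # Rank stored states by distance to the query and keep the k closest.
    nearest = sorted(table, key=lambda s: math.dist(s, query))[:k]
    return sum(table[s] for s in nearest) / len(nearest)


# Toy value table over 2-D states (made-up numbers, not from the paper).
value_table = {
    (0.0, 0.0): 10.0,
    (1.0, 0.0): 12.0,
    (0.0, 1.0): 14.0,
    (5.0, 5.0): 100.0,
}

# Estimate the value of a state that is not in the table.
estimate = knn_value((0.2, 0.2), value_table, k=3)
print(estimate)
```

Because only states already in the table contribute, an estimate like this can be computed without covering the full state space, which is in the spirit of the "evolving set of relevant states" the abstract describes.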
