A simulation-based learning automata framework for solving semi-Markov decision problems under long-run average reward



Abstract

Many problems of sequential decision making under uncertainty whose underlying probabilistic structure is a Markov chain can be set up as Markov Decision Problems (MDPs). However, when the underlying transition mechanism cannot be characterized by the Markov chain alone, the problems may be set up as Semi-Markov Decision Problems (SMDPs). Dynamic programming has been used extensively in the literature to solve such problems. An alternative framework in the literature is that of Learning Automata (LA). This framework can be combined with simulation to develop convergent LA algorithms for solving MDPs under long-run average cost (or reward). A very attractive feature of this framework is that it avoids a major stumbling block of dynamic programming: having to compute the one-step transition probability matrices of the Markov chain for every possible action of the decision-making process. In this paper, we extend this framework to the more general SMDP. We also present numerical results on a case study from the domain of preventive maintenance in which the decision-making problem is modeled as an SMDP. An algorithm based on LA theory is employed, which may be implemented in a simulator as a solution method. It produces satisfactory results in all the numerical examples studied.
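To make the idea of a simulation-based learning automaton concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a hypothetical two-state maintenance-style SMDP (`MODEL`), a standard linear reward-inaction update, assumed bounds `RATE_MIN` and `RATE_MAX` for scaling feedback into [0, 1], and a feedback signal equal to the reward earned per unit time since the state was last visited. All names and numbers are illustrative. The learner only draws samples from `simulate_step` and never reads the transition probabilities directly.

```python
import random

# Hypothetical toy SMDP (not from the paper): state 0 = "good", state 1 = "worn";
# action 0 = keep producing, action 1 = perform maintenance.
# MODEL[state][action] = list of (probability, next_state, reward, mean_sojourn_time)
MODEL = {
    0: {0: [(0.7, 0, 10.0, 1.0), (0.3, 1, 10.0, 1.0)],
        1: [(1.0, 0, -2.0, 0.5)]},
    1: {0: [(0.6, 1, 4.0, 1.0), (0.4, 0, -20.0, 3.0)],  # failure, then a long repair
        1: [(1.0, 0, -5.0, 1.0)]},
}
# Assumed bounds on the reward rate, used only to scale feedback into [0, 1].
RATE_MIN, RATE_MAX = -20.0, 20.0


def simulate_step(state, action, rng):
    """Sample one transition; the learner sees only this output, never MODEL."""
    u, acc = rng.random(), 0.0
    for prob, nxt, reward, mean_time in MODEL[state][action]:
        acc += prob
        if u <= acc:
            break
    # Sojourn time is random and strictly positive (uniform around its mean).
    return nxt, reward, rng.uniform(0.5 * mean_time, 1.5 * mean_time)


def learn_policy(num_steps=20000, lr=0.01, seed=0):
    """Run one automaton per state and return (action probabilities, average reward)."""
    rng = random.Random(seed)
    n_actions = 2
    probs = {s: [1.0 / n_actions] * n_actions for s in MODEL}
    last_visit = {}  # state -> (action, total_reward, total_time) at the last visit
    total_reward, total_time = 0.0, 0.0
    state = 0
    for _ in range(num_steps):
        if state in last_visit:
            prev_action, r0, t0 = last_visit[state]
            # Feedback: reward per unit time earned since the last visit to this state.
            rate = (total_reward - r0) / (total_time - t0)
            beta = min(1.0, max(0.0, (rate - RATE_MIN) / (RATE_MAX - RATE_MIN)))
            p = probs[state]
            # Linear reward-inaction update: reinforce the previously chosen action
            # in proportion to beta; the probabilities still sum to one.
            for a in range(n_actions):
                if a == prev_action:
                    p[a] += lr * beta * (1.0 - p[a])
                else:
                    p[a] -= lr * beta * p[a]
        action = rng.choices(range(n_actions), weights=probs[state])[0]
        last_visit[state] = (action, total_reward, total_time)
        state, reward, tau = simulate_step(state, action, rng)
        total_reward += reward
        total_time += tau
    return probs, total_reward / total_time


if __name__ == "__main__":
    learned, avg_reward = learn_policy()
    print("action probabilities per state:", learned)
    print("observed average reward per unit time:", avg_reward)
```

Dividing accumulated reward by accumulated time, rather than by the number of transitions, is what separates the semi-Markov setting from the plain MDP case in this sketch. The step size `lr` and the number of steps are arbitrary choices and would need tuning for any real problem.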

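Bibliographic details

  • Source
    IIE Transactions | 2004, Issue 6 | pp. 557-567 | 11 pages
  • Author affiliation

    Department of Industrial Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: General industrial technology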
  • Date added to database: 2022-08-18 03:51:12
