首页> 外文学位 >A hybrid genetic/optimization algorithm for piecewise affine and convex Markov decision processes.

【24h】

A hybrid genetic/optimization algorithm for piecewise affine and convex Markov decision processes.

机译：分段仿射和凸马尔可夫决策过程的混合遗传/优化算法。

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
相似文献
相关主题

摘要

A variety of finite horizon models of sequential decision making under uncertainty have value functions that are piecewise affine and convex ( PAC) or equivalently, can be represented by a finite set of affine functions. Such models include: the partially observed Markov decision problem (POMDP) and the Markov decision problem (MDP) with affine single-step and/or terminal reward function, examples of which are the multi-objective MDP and the nonhomogeneous MDP (NMDP). We develop both optimal and suboptimal solution procedures for this class of models. Our optimal solution approach, GAMIP, integrates a genetic algorithm (GA) and a mixed integer program (MIP). The genetic algorithm step only is a suboptimal design with good solution quality. Both optimal approach and suboptimal design construct the minimal set of affine functions that describe the value function. The POMDP and NMDP are investigated numerically.; The POMDP is a generalization of a MDP that allows for observations that are probabilistically related to the underlying system state. The value function of the finite horizon POMDP is known to be PAC in the probability mass vector over the state space. For suboptimal design, we develop two heuristic approaches based on genetic algorithms, termed as dc-niche and spatial, to construct approximations of the minimal set of affine functions that describes the value function. Error bounds of these two approaches are presented. Numerical results indicate that these heuristics can quickly yield high quality solutions for a set of small-scale problems where exact solutions are feasible and can provide suboptimal designs for realistically sized problems. Numerical results indicate that GAMIP takes up to 60% less time than the most efficient linear programming-based exact solution method in the literature to construct the minimal set.; For the study of an infinite horizon NMDP, the objective is to determine an optimal initial action. We transform this infinite horizon NMDP into a finite horizon NMDP having a terminal reward vector described by set inclusion. Our approach involves finding a forecast horizon, a problem horizon such that an optimal initial decision for the finite horizon problem is optimal for the infinite horizon problem. We make use of the fact that the value function of the finite horizon NMDP having a terminal reward vector described by set inclusion is PAC in the terminal reward vector. Our solution procedure is based on a combination of the genetic algorithm and mixed integer programming. We show that a suboptimal procedure based solely on the genetic algorithm quickly identifies high quality candidates for the minimal forecast horizon. The mixed integer program is used to confirm or rule out a candidate for a forecast horizon. We then present a numerical evaluation that compares our approach with several other algorithms.

机译：不确定条件下的顺序决策的各种有限视野模型具有分段仿射和凸（ PAC ）或等效的值函数，可以由有限组仿射函数表示。此类模型包括：具有仿射单步和/或最终奖励功能的部分观测的Markov决策问题（ POMDP ）和Markov决策问题（ MDP ），其示例是多目标 MDP 和非同质 MDP （ NMDP ）。我们为此类模型开发了最优和次优的求解程序。我们的最佳解决方案 GAMIP 集成了遗传算法（ GA ）和混合整数程序（ MIP ）。遗传算法步骤只是解决方案质量较差的次优设计。最优方法和次优设计都构成了描述值函数的最小仿射函数集。数值研究了 POMDP 和 NMDP 。 POMDP 是 MDP 的概括，它允许与基础系统状态概率相关的观察。在状态空间上的概率质量向量中，有限水平的 POMDP 的值函数已知为 PAC 。对于次优设计，我们基于遗传算法开发了两种启发式方法，分别称为dc小生境和空间算法，以构造描述值函数的最小仿射函数集的近似值。介绍了这两种方法的误差范围。数值结果表明，这些启发式方法可以针对一组小规模问题快速生成高质量的解决方案，其中精确的解决方案是可行的，并且可以为实际大小的问题提供次优设计。数值结果表明， GAMIP 花费的时间比文献中最有效的基于线性规划的精确解方法构造最小集所需的时间少60％。为了研究无限水平的 NMDP ，目标是确定最佳的初始动作。我们将此无限视野 NMDP 转换为有限视野 NMDP ，其具有通过集合包含描述的最终奖励向量。我们的方法涉及找到一个预测范围，即一个问题范围，从而使有限范围问题的最优初始决策对于无限范围问题是最优的。我们利用以下事实：在最终奖励向量中，具有通过集合包含描述的最终奖励向量的有限范围 NMDP 的值函数为 PAC 。我们的解决方案基于遗传算法和混合整数编程。我们表明，仅基于遗传算法的次优过程可以快速识别出最低预测范围的高质量候选对象。混合整数程序用于确认或排除预测范围的候选对象。然后，我们提出了一个数值评估，将我们的方法与其他几种算法进行了比较。

著录项

作者
Lin, Zong-Zhi.;
展开▼
作者单位

University of Michigan.;

展开▼
授予单位 University of Michigan.;
学科 Operations Research.; Computer Science.
学位 Ph.D.
年度 1999
页码 101 p.
总页数 101
原文格式 PDF
正文语种 eng
中图分类运筹学;自动化技术、计算机技术;
关键词

相似文献

外文文献
中文文献
专利

1. A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes [J] . Zong-Zhi Lin, James C. Bean, Chelsea C. White III ORSA Journal on Computing . 2004,第1期

机译：有限视野，部分观测的马尔可夫决策过程的混合遗传/优化算法
2. A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes [J] . Zong-Zhi Lin, James C. Bean, Chelsea C. White III ORSA Journal on Computing . 2004,第1期

机译：有限视野，部分观测的马尔可夫决策过程的混合遗传/优化算法
3. Hybrid Genetic Algorithm and Invasive Weed Optimization via Priority Based Encoding for Location-Allocation Decisions in a Three-Stage Supply Chain [J] . Atabaki Mohammad Saeid, Mohammadi Mohammad, Naderi Bahman Asia-Pacific Journal of Operational Research . 2017,第2期

机译：混合遗传算法和基于优先级编码的入侵杂草优化用于三阶段供应链中的位置分配决策
4. Active Fault Diagnosis Based on Consistencies to a Class of Hybrid Systems by Using Genetic Algorithms and Markov Decision Process [C] . Iván Escandón, Mariela Cerrada IEEE ANDESCON Conference . 2020

机译：通过使用遗传算法和马尔可夫决策过程，基于一类混合系统的常态故障诊断
5. MICA: A Hybrid Method for Corpus-Based Algorithmic Composition of Music Based on Genetic Algorithms, Zipf's Law, and Markov Models [D] . Nagelberg, Alan 2014

机译：MICA：基于遗传算法，齐普夫定律和马尔可夫模型的基于语料库的音乐算法合成的混合方法
6. Spatial and temporal regularization to estimate COVID-19 reproduction number R(t): Promoting piecewise smoothness via convex optimization [O] . Patrice Abry, Nelly Pustelnik, Stéphane Roux, 2020

机译：空间和时间正则正规估计Covid-19再现号R（t）：通过凸优化促进分段平滑度
7. PRM14 Calibration of Piecewise Markov Models Using a Change-Point Analysis Through an Iterative Convex Optimization Algorithm [O] . Alarid-Escudero F, Enns E, Peralta-Torres YE, 2015

机译：通过迭代凸优化算法使用变化点分析对分段马尔可夫模型进行PRM14校准
8. Computational Comparison of Value Iteration Algorithms for Discounted Markov Decision Processes. [R] . Thomas, L. C., Hartley, R., Lavercombe, A. C. 1982

机译：马尔可夫决策过程的价值迭代算法计算比较。

A hybrid genetic/optimization algorithm for piecewise affine and convex Markov decision processes.

摘要

著录项

相似文献

相关主题

期刊订阅