Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min (Max) Polynomial Bellman Equations

Abstract

We show that one can compute the least nonnegative solution (also known as the least fixed point) for a system of probabilistic min (max) polynomial equations, to any desired accuracy epsilon > 0 in time polynomial in both the encoding size of the system and in log(1/epsilon). These are Bellman optimality equations for important classes of infinite-state Markov decision processes (MDPs), including branching MDPs (BMDPs), which generalize classic multitype branching stochastic processes. We thus obtain the first polynomial time algorithm for computing, to any desired precision, optimal (maximum and minimum) extinction probabilities for BMDPs. Our algorithms are based on a novel generalization of Newton's method, which employs linear programming in each iteration. We also provide polynomial-time (P-time) algorithms for computing an epsilon-optimal policy for both maximizing and minimizing extinction probabilities in a BMDP, whereas we note a hardness result for computing an exact optimal policy. Furthermore, improving on prior results, we provide more efficient P-time algorithms for qualitative analysis of BMDPs, that is, for determining whether the maximum or minimum extinction probability is 1, and, if so, computing a policy that achieves this. We also observe some complexity consequences of our results for branching simple stochastic games, which generalize BMDPs.
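
To make the notion of a least fixed point of a max (min) probabilistic polynomial system concrete, here is a minimal sketch in Python. It is not the generalized Newton method with linear programming described in the abstract; it uses plain value iteration from the zero vector, which is simpler but can converge very slowly on some instances. The one-type BMDP `bmdp`, its two actions and their probabilities, and the helper names `action_value` and `least_fixed_point` are all hypothetical and chosen only for illustration.

```python
from math import prod

def action_value(rules, x):
    """Evaluate one probabilistic polynomial: sum of p * product of x[j] over offspring."""
    # rules: list of (probability, offspring) pairs; offspring is a tuple of type indices.
    # An empty offspring tuple means the entity dies (prod of nothing is 1).
    return sum(p * prod(x[j] for j in offspring) for p, offspring in rules)

def least_fixed_point(system, choose, tol=1e-12, max_iter=100_000):
    """Approximate the least nonnegative solution of x_i = choose_a P_{i,a}(x).

    Plain value iteration starting from the zero vector (an illustrative
    stand-in, NOT the paper's generalized Newton method).
    `choose` is the built-in max or min.
    """
    x = [0.0] * len(system)
    for _ in range(max_iter):
        nxt = [choose(action_value(rules, x) for rules in actions)
               for actions in system]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x

# Hypothetical one-type BMDP: each entity of type 0 either dies or splits in two.
bmdp = [[
    [(0.6, ()), (0.4, (0, 0))],  # action a: P_a(x) = 0.6 + 0.4 x^2
    [(0.3, ()), (0.7, (0, 0))],  # action b: P_b(x) = 0.3 + 0.7 x^2
]]

max_ext = least_fixed_point(bmdp, max)[0]  # maximum extinction probability
min_ext = least_fixed_point(bmdp, min)[0]  # minimum extinction probability
print(max_ext, min_ext)
```

For this toy instance the equations can be solved by hand: always playing action a gives extinction probability 1 (the least root of 0.4x^2 - x + 0.6 = 0), while always playing action b gives 3/7 (the least root of 0.7x^2 - x + 0.3 = 0), so the iteration should approach 1 and 3/7 respectively.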

Bibliographic details

Source: Mathematics of Operations Research, 2020, Issue 1, 29 pages
Original format: PDF
Language: English
CLC classification: Operations research
Keywords: multitype branching processes; Markov decision processes; Bellman optimality equations; generalized Newton method; polynomial time algorithms
Added to database: 2022-08-20 03:46:34

Similar literature

1. Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min (Max) Polynomial Bellman Equations [J]. Mathematics of Operations Research. 2020, Issue 1.
2. Polynomial time decision algorithms for probabilistic automata [J]. Andrea Turrini, Holger Hermanns. Information and Computation. 2015, Issue OCTa.
3. Polynomial Time Algorithm for Determining Max-Min Paths in Networks and Solving Zero Value Cyclic Games [J]. Dmitrii D. Lozovanu. Computer Science Journal of Moldova. 2005, Issue 2.
4. Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations [C]. Kousha Etessami, Alistair Stewart, Mihalis Yannakakis. International Colloquium on Automata, Languages and Programming. 2012.
5. Algorithms for solving linear and polynomial systems of equations over finite fields, with applications to cryptanalysis [D]. Bard, Gregory V. 2007.
6. Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary trees [O]. Christian Arnold, Peter F Stadler. 2010.
7. Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min(Max) Polynomial Bellman Equations [O]. Etessami, Kousha; Stewart, Alistair; Yannakakis, Mihalis. 2012.
