首页> 外文期刊>Formal Methods in System Design >Symbolic algorithms for qualitative analysis of Markov decision processes with Buechi objectives
【24h】

Symbolic algorithms for qualitative analysis of Markov decision processes with Buechi objectives

机译:具有Buechi目标的Markov决策过程的定性分析的符号算法

获取原文
获取原文并翻译 | 示例
           

摘要

We consider Markov decision processes (MDPs) with Buechi (liveness) objectives. We consider the problem of computing the set of almost-sure winning states from where the objective can be ensured with probability 1. Our contributions are as follows: First, we present the first subquadratic symbolic algorithm to compute the almost-sure winning set for MDPs with Biichi objectives; our algorithm takes O(n • m~(1/2)) symbolic steps as compared to the previous known algorithm that takes O(n~2) symbolic steps, where n is the number of states and m is the number of edges of the MDP. In practice MDPs have constant out-degree, and then our symbolic algorithm takes O(n • n~(1/2)) symbolic steps, as compared to the previous known O(n~2) symbolic steps algorithm. Second, we present a new algorithm, namely win-lose algorithm, with the following two properties: (a) the algorithm iteratively computes subsets of the almost-sure winning set and its complement, as compared to all previous algorithms that discover the almost-sure winning set upon termination; and (b) requires O(n • k~(1/2)) symbolic steps, where K is the maximal number of edges of strongly connected components (scc's) of the MDP. The win-lose algorithm requires symbolic computation of scc's. Third, we improve the algorithm for symbolic sec computation; the previous known algorithm takes linear symbolic steps, and our new algorithm improves the constants associated with the linear number of steps. In the worst case the previous known algorithm takes 5•n symbolic steps, whereas our new algorithm takes 4•n symbolic steps.
机译:我们考虑具有Buechi(活动性)目标的Markov决策过程(MDP)。我们考虑计算可以从中以概率1保证目标的几乎确定的获胜状态集的问题。我们的贡献如下:首先,我们提出第一个二次符号算法来计算MDP的几乎确定的获胜集。以Biichi目标为目标;与之前已知的采用O(n〜2)符号步长的已知算法相比,我们的算法采用O(n•m〜(1/2))个符号步长,其中n是状态数,m是边缘的数量MDP。在实践中,MDP具有恒定的出度,因此与先前已知的O(n〜2)符号步长算法相比,我们的符号算法采用O(n•n〜(1/2))符号步长。其次,我们提出了一种新的算法,即输赢算法,它具有以下两个属性:(a)与发现几乎以下情况的所有先前算法相比,该算法迭代地计算几乎确定的获胜集合及其补集的子集:终止时确定赢钱; (b)需要O(n•k〜(1/2))个符号步,其中K是MDP的强连接组件(scc)的最大边数。双输算法需要对scc进行符号计算。第三,我们改进了符号秒计算的算法。先前的已知算法采用线性符号步长,而我们的新算法改进了与线性步长相关的常数。在最坏的情况下,以前的已知算法需要5•n个符号步,而我们的新算法需要4•n个符号步。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号