The Journal of Artificial Intelligence Research
On Polynomial Sized MDP Succinct Policies

Abstract

Policies of Markov Decision Processes (MDPs) determine the next action to execute from the current state and, possibly, the history (the past states). When the number of states is large, succinct representations are often used to compactly represent both the MDPs and the policies in a reduced amount of space. In this paper, some problems related to the size of succinctly represented policies are analyzed. Namely, it is shown that some MDPs have policies that can only be represented in space super-polynomial in the size of the MDP, unless the polynomial hierarchy collapses. This fact motivates the study of the problem of deciding whether a given MDP has a policy of a given size and reward. Since some algorithms for MDPs work by finding a succinct representation of the value function, the problem of deciding the existence of a succinct representation of a value function of a given size and reward is also considered.
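
To make the notion of a succinct policy concrete, the following Python sketch (not from the paper; the bit-vector state encoding, the action names `act_a`/`act_b`, and the parity rule are purely illustrative) contrasts an explicit state-to-action table, whose size grows as 2^n in the number n of state variables, with a policy given as a small program that computes the same mapping in constant description size.

```python
# Hypothetical sketch: explicit tabular policy vs. a succinct (program-form) policy
# over a factored state space of N_BITS boolean variables.

from itertools import product

N_BITS = 4  # the explicit table below has 2**N_BITS entries


# Explicit policy: one table entry per state, so its size is exponential in N_BITS.
explicit_policy = {
    state: ("act_a" if sum(state) % 2 == 0 else "act_b")
    for state in product((0, 1), repeat=N_BITS)
}


def succinct_policy(state):
    """Succinct policy: a small program computing the same state-to-action mapping
    without materialising the full table; its description size does not grow with N_BITS."""
    return "act_a" if sum(state) % 2 == 0 else "act_b"


if __name__ == "__main__":
    # The two representations agree on every state, but differ vastly in size.
    for state in product((0, 1), repeat=N_BITS):
        assert explicit_policy[state] == succinct_policy(state)
    print("explicit table entries:", len(explicit_policy))
```

This is the kind of size gap at issue in the abstract: whether a policy (or value function) of a given reward admits a representation of polynomial size, rather than one that must enumerate the state space.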