Compactness of the space of non-randomized policies in countable-state sequential decision processes

Chen RC; Feinberg EA

Abstract

For sequential decision processes with countable state spaces, we prove compactness of the set of strategic measures corresponding to nonrandomized policies. For the Borel state case, this set may not be compact (Piunovskiy, Optimal control of random sequences in problems with constraints. Kluwer, Boston, p. 170, 1997) in spite of compactness of the set of strategic measures corresponding to all policies (Schäl, On dynamic programming: compactness of the space of policies. Stoch Processes Appl 3(4):345-364, 1975b; Balder, On compactness of the space of policies in stochastic dynamic programming. Stoch Processes Appl 32(1):141-150, 1989). We use the compactness result from this paper to show the existence of optimal policies for countable-state constrained optimization of expected discounted and nonpositive rewards, when the optimality is considered within the class of nonrandomized policies. This paper also studies the convergence of a value-iteration algorithm for such constrained problems.

Bibliographic details

Source
Mathematical Methods of Operations Research | 2010, Issue 2 | 17 pages
Authors
Chen RC; Feinberg EA;
Original format: PDF
Language: English
Chinese Library Classification: Operations research
Keywords
Markov decision processes; Compactness; Non-randomized policies;

Similar literature

1. Compactness of the space of non-randomized policies in countable-state sequential decision processes [J]. Chen RC, Feinberg EA. Mathematical Methods of Operations Research. 2010, Issue 2
2. Non-randomized policies for constrained Markov decision processes [J]. Richard C. Chen, Eugene A. Feinberg. Mathematical Methods of Operations Research. 2007, Issue 1
3. Average Cost Optimality Inequality for Markov Decision Processes with Borel Spaces and Universally Measurable Policies [J]. Yu Huizhen. SIAM Journal on Control and Optimization. 2020, Issue 4
4. Convex synthesis of optimal policies for Markov Decision Processes with sequentially-observed transitions [C]. Mahmoud El Chamie, Behçet Açıkmeşe. American Control Conference. 2016
5. How honey bees use visual landmarks during goal-directed navigation: Wayfinding strategies as sequential decision-making processes [D]. Bartlett, Francis Norman, III. 2006
6. Markov Decision Processes: A Tool for Sequential Decision Making under Uncertainty [O]. Oguzhan Alagoz, Heather Hsu, Andrew J. Schaefer, -1
7. Compactness of the Space of Non-Randomized Policies in Countable-State Sequential Decision Processes [O]. Richard C. Chen, et al. 2010
