Journal: Mathematical Methods of Operations Research

Compactness of the space of non-randomized policies in countable-state sequential decision processes

Abstract

For sequential decision processes with countable state spaces, we prove compactness of the set of strategic measures corresponding to nonrandomized policies. For the Borel state case, this set may not be compact (Piunovskiy, Optimal control of random sequences in problems with constraints. Kluwer, Boston, p. 170, 1997) in spite of compactness of the set of strategic measures corresponding to all policies (Schäl, On dynamic programming: compactness of the space of policies. Stoch Processes Appl 3(4):345-364, 1975b; Balder, On compactness of the space of policies in stochastic dynamic programming. Stoch Processes Appl 32(1):141-150, 1989). We use the compactness result from this paper to show the existence of optimal policies for countable-state constrained optimization of expected discounted and nonpositive rewards, when the optimality is considered within the class of nonrandomized policies. This paper also studies the convergence of a value-iteration algorithm for such constrained problems.
