International Journal of Information Technology & Decision Making

PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES AND PERIODIC POLICIES WITH APPLICATIONS

Abstract

This paper treats the infinite-horizon discounted-cost control problem for partially observable Markov decision processes. Sondik studied the class of finitely transient policies and showed that their value functions over an infinite time horizon are piecewise linear (p.w.l.) and can be computed exactly by solving a system of linear equations. However, the condition for finite transience is stronger than is needed to ensure p.w.l. value functions. In this paper, we introduce an alternative class, the periodic policies, whose value functions also turn out to be p.w.l. Moreover, we examine a condition more general than both finite transience and periodicity that still ensures p.w.l. value functions. We apply these ideas to a replacement problem under Markovian deterioration, investigate periodic policies for it, and give numerical examples.
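To make concrete the idea of a value function obtained "by solving a system of linear equations", here is a minimal Python sketch. It is not the paper's algorithm or its definition of a periodic policy. It assumes a policy represented as a small finite-state controller (one action per node, next node chosen by the observation), and it uses an invented two-state replacement model; the names `evaluate_controller`, `solve_linear`, `value`, and every number are hypothetical. Each controller node yields one vector of costs, linear in the belief, and taking the cheapest start node gives a p.w.l. function of the belief.

```python
# Illustrative sketch only: the model numbers and the cyclic controller are
# assumptions made for this example, not taken from the paper.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def evaluate_controller(P, O, cost, beta, action, succ):
    """Return alpha[n][s], the discounted cost of starting in node n, state s.

    P[a][s][t] : probability of moving from state s to t under action a
    O[a][t][o] : probability of observing o after arriving in t under a
    cost[a][s] : immediate cost of action a in state s
    action[n]  : action taken in controller node n
    succ[n][o] : next controller node after observing o in node n
    """
    N, S = len(action), len(cost[0])
    A = [[0.0] * (N * S) for _ in range(N * S)]
    rhs = [0.0] * (N * S)
    for n in range(N):
        a = action[n]
        for s in range(S):
            row = n * S + s
            A[row][row] += 1.0
            rhs[row] = cost[a][s]
            for t in range(S):
                for o, m in enumerate(succ[n]):
                    A[row][m * S + t] -= beta * P[a][s][t] * O[a][t][o]
    x = solve_linear(A, rhs)
    return [x[n * S:(n + 1) * S] for n in range(N)]


# Hypothetical replacement model: state 0 = good, state 1 = failed.
# Action 0 = operate, action 1 = replace (next state is good).
P = [[[0.8, 0.2], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]]]
O = [[[0.9, 0.1], [0.3, 0.7]], [[1.0, 0.0], [1.0, 0.0]]]
cost = [[0.0, 5.0], [3.0, 3.0]]
beta = 0.9

# Cyclic controller with period 3: operate, operate, replace, repeat.
# It ignores observations, so both observations lead to the same next node.
action = [0, 0, 1]
succ = [[1, 1], [2, 2], [0, 0]]

alpha = evaluate_controller(P, O, cost, beta, action, succ)


def value(belief):
    """Cheapest start node for this belief: a minimum of linear functions."""
    return min(sum(b * a for b, a in zip(belief, vec)) for vec in alpha)
```

Calling `value([0.5, 0.5])` evaluates the lower envelope of the three linear functions at a belief that puts equal weight on the good and failed states. A controller whose `succ` entries differ by observation is evaluated by the same linear system; only the coefficients change.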