Journal: IEEE Transactions on Automatic Control

Percentile performance criteria for limiting average Markov decision processes

Abstract

This paper addresses the following basic feasibility problem for infinite-horizon Markov decision processes (MDPs): can a policy be found that achieves a specified value (target) of the long-run limiting average reward at a specified probability level (percentile)? Related optimization problems, maximizing the target for a specified percentile and maximizing the percentile for a specified target, are also considered. The authors present a complete (and discrete) classification of both the maximal achievable target levels and their corresponding percentiles. The authors also provide an algorithm for computing a deterministic policy corresponding to any feasible target-percentile pair. Next, the authors consider similar problems for an MDP with multiple rewards and/or constraints. This case presents some difficulties and leads to several open problems. An LP-based formulation provides constructive solutions for most cases.
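
The feasibility and optimization problems described in the abstract can be written out as a short sketch. The notation here is an assumption made for illustration and is not taken from the abstract: $\pi$ is a policy, $s$ an initial state, $R_t$ the reward received at time $t$, $\tau$ the target, and $\alpha$ the percentile. The use of $\liminf$ in the definition of the limiting average is also an assumed convention.

```latex
% Limiting average reward along a trajectory (a random variable).
% Assumed notation: R_t is the reward at time t.
\[
  \phi \;=\; \liminf_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} R_t
\]

% Feasibility problem: given a target \tau and a percentile \alpha in [0,1],
% does there exist a policy \pi such that
\[
  P_{\pi}^{s}\bigl(\phi \ge \tau\bigr) \;\ge\; \alpha \; ?
\]

% Related optimization problems:
% (a) maximize the target for a specified percentile \alpha
\[
  \max \bigl\{ \tau \;:\; \exists\, \pi \text{ with } P_{\pi}^{s}(\phi \ge \tau) \ge \alpha \bigr\}
\]
% (b) maximize the percentile for a specified target \tau
\[
  \max_{\pi} \; P_{\pi}^{s}\bigl(\phi \ge \tau\bigr)
\]
```

Under this reading, a target-percentile pair $(\tau, \alpha)$ is feasible exactly when some policy satisfies the inequality in the second display.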