Conference on Uncertainty in Artificial Intelligence
Solution Methods for Constrained Markov Decision Process with Continuous Probability Modulation

Abstract

We propose solution methods for previously-unsolved constrained MDPs in which actions can continuously modify the transition probabilities within some acceptable sets. While many methods have been proposed to solve regular MDPs with large state sets, there are few practical approaches for solving constrained MDPs with large action sets. In particular, we show that the continuous action sets can be replaced by their extreme points when the rewards are linear in the modulation. We also develop a tractable optimization formulation for concave reward functions and, surprisingly, also extend it to non-concave reward functions by using their concave envelopes. We evaluate the effectiveness of the approach on the problem of managing delinquencies in a portfolio of loans.
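The extreme-point claim can be illustrated with a small sketch. It is not the paper's algorithm and it ignores the constraints entirely: it only shows a single unconstrained Bellman backup in which a scalar modulation `a` in [0, 1] interpolates linearly between two transition rows and two rewards. Every name and number here (`P_LO`, `P_HI`, `R_LO`, `R_HI`, `V`, `GAMMA`) is a hypothetical choice made for illustration. Because the backup value is affine in `a`, its maximum over the interval sits at an endpoint.

```python
# Illustrative only: one unconstrained Bellman backup for a single state.
# All numbers and names below are hypothetical, not taken from the paper.

GAMMA = 0.9
V = [1.0, 4.0, 2.5]       # assumed values of the three next states
P_LO = [0.7, 0.2, 0.1]    # transition row at modulation a = 0
P_HI = [0.1, 0.3, 0.6]    # transition row at modulation a = 1
R_LO, R_HI = 1.0, 0.2     # rewards at a = 0 and a = 1, linear in between


def q_value(a: float) -> float:
    """Backup value for modulation a in [0, 1]; affine in a by construction."""
    reward = (1 - a) * R_LO + a * R_HI
    probs = [(1 - a) * lo + a * hi for lo, hi in zip(P_LO, P_HI)]
    return reward + GAMMA * sum(p * v for p, v in zip(probs, V))


def best_on_grid(n: int = 101) -> float:
    """Return the grid point in [0, 1] with the largest backup value."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=q_value)
```

With these assumed numbers, `best_on_grid()` returns the endpoint `1.0`, so searching only the two extreme modulations would find the same maximizer as the full grid. Whether and how this carries over to the constrained setting is what the paper itself addresses.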
