Conference on Uncertainty in Artificial Intelligence

Solution Methods for Constrained Markov Decision Process with Continuous Probability Modulation


Abstract

We propose solution methods for previously unsolved constrained MDPs in which actions can continuously modify the transition probabilities within some acceptable sets. While many methods have been proposed to solve regular MDPs with large state sets, there are few practical approaches for solving constrained MDPs with large action sets. In particular, we show that the continuous action sets can be replaced by their extreme points when the rewards are linear in the modulation. We also develop a tractable optimization formulation for concave reward functions and, surprisingly, extend it to non-concave reward functions by using their concave envelopes. We evaluate the effectiveness of the approach on the problem of managing delinquencies in a portfolio of loans.
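The extreme-point claim can be illustrated with a small sketch. This is not the paper's algorithm: it drops the constraints entirely and uses a hypothetical two-state MDP whose numbers (`P_LO`, `P_HI`, `A`, `B`, `GAMMA`) are invented for illustration. It assumes a scalar modulation `theta` in [0, 1] that interpolates linearly between two transition rows, and a reward that is linear in `theta`. Under those assumptions the one-step lookahead is affine in `theta`, so its maximum over the interval is attained at an endpoint, and value iteration over the two extreme points should agree with value iteration over a fine grid.

```python
# Illustrative sketch only: a hypothetical 2-state, unconstrained MDP.
# All numbers below are made up; none come from the paper.
GAMMA = 0.9

# Transition rows at the two extreme points of the modulation interval [0, 1].
P_LO = [[0.9, 0.1], [0.5, 0.5]]   # rows used when theta = 0
P_HI = [[0.6, 0.4], [0.2, 0.8]]   # rows used when theta = 1

# Reward r(s, theta) = A[s] + B[s] * theta, i.e. linear in the modulation.
A = [1.0, 0.0]
B = [-0.3, 0.5]


def q_value(s, theta, v):
    """One-step lookahead; affine in theta for a fixed value vector v."""
    reward = A[s] + B[s] * theta
    expected_next = sum(
        ((1 - theta) * P_LO[s][j] + theta * P_HI[s][j]) * v[j]
        for j in range(len(v))
    )
    return reward + GAMMA * expected_next


def value_iteration(thetas, iters=500):
    """Value iteration restricted to the candidate modulations in `thetas`."""
    v = [0.0, 0.0]
    for _ in range(iters):
        v = [max(q_value(s, t, v) for t in thetas) for s in range(len(v))]
    return v


v_extreme = value_iteration([0.0, 1.0])                  # extreme points only
v_grid = value_iteration([k / 100 for k in range(101)])  # fine grid on [0, 1]
print(v_extreme, v_grid)
```

The two value vectors should match up to floating-point error. In the constrained setting addressed by the abstract, the optimal policy may need to randomize over extreme points, which this sketch does not model.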
