Journal of Economic Theory

Ambiguous partially observable Markov decision processes: Structural results and applications


Abstract

Markov Decision Processes (MDPs) have been widely used as invaluable tools in dynamic decision making, which is a central concern for economic agents operating at both the micro and macro levels. Often the decision maker's information about the state is incomplete; hence, the generalization to Partially Observable MDPs (POMDPs). Unfortunately, POMDPs may require a large state and/or action space, creating the well-known "curse of dimensionality." However, recent computational contributions and blindingly fast computers have helped to dispel this curse. This paper introduces and addresses a second curse, termed the "curse of ambiguity," which refers to the fact that the exact transition probabilities are often hard to quantify and are rather ambiguous. For instance, for a monetary authority concerned with dynamically setting the inflation rate so as to control unemployment, the dynamics of the unemployment rate under any given inflation rate are often ambiguous. Similarly, in worker-job matching, the dynamics of the worker-job match/proficiency level are typically ambiguous. This paper addresses the "curse of ambiguity" by developing a generalization of POMDPs termed Ambiguous POMDPs (APOMDPs), which not only allows the decision maker to take into account imperfect state information, but also tackles the inevitable ambiguity with respect to the correct probabilistic model of transitions.
