International Joint Conference on Artificial Intelligence

Equivalence Relations in Fully and Partially Observable Markov Decision Processes


Abstract

We explore equivalence relations between states in Markov Decision Processes and Partially Observable Markov Decision Processes. We focus on two equivalence notions: bisimulation [Givan et al., 2003] and a notion of trace equivalence, under which states are considered equivalent if they generate the same conditional probability distributions over observation sequences (where the conditioning is on action sequences). We show that the relationship between these two notions changes depending on the amount and nature of the partial observability. We also present an alternative characterization of bisimulation based on trajectory equivalence.
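The trace-equivalence notion described above can be illustrated with a small sketch. It is not the authors' algorithm: it only compares two states up to a finite horizon by brute-force enumeration, and it assumes a particular observation convention (each observation is emitted by the state reached after the action). The names `trace_prob`, `trace_equivalent`, and the dictionaries `T` and `O` are hypothetical, introduced only for this example.

```python
from itertools import product

def trace_prob(T, O, s, actions, observations):
    """P(observations | start state s, actions).

    Assumed convention: T[state][action] maps next states to probabilities,
    O[state] maps observations to probabilities, and observation o_t is
    emitted by the state reached after taking action a_t.
    """
    # weights[x] = joint probability of being in x and having seen the prefix so far
    weights = {s: 1.0}
    for a, o in zip(actions, observations):
        nxt = {}
        for state, w in weights.items():
            for s2, p in T[state][a].items():
                q = w * p * O[s2].get(o, 0.0)
                if q > 0.0:
                    nxt[s2] = nxt.get(s2, 0.0) + q
        weights = nxt
    return sum(weights.values())

def trace_equivalent(T, O, s1, s2, action_set, obs_set, horizon, tol=1e-9):
    """Finite-horizon check only: compares all traces of length 1..horizon."""
    for k in range(1, horizon + 1):
        for acts in product(action_set, repeat=k):
            for obs in product(obs_set, repeat=k):
                p1 = trace_prob(T, O, s1, acts, obs)
                p2 = trace_prob(T, O, s2, acts, obs)
                if abs(p1 - p2) > tol:
                    return False
    return True

# Toy model (made up for illustration): one action "a", observations "0" and "1".
T = {
    "t": {"a": {"u": 1.0}},
    "u": {"a": {"w": 1.0}},
    "v": {"a": {"w": 1.0}},
    "w": {"a": {"w": 1.0}},
}
O = {
    "t": {"0": 1.0},
    "u": {"0": 1.0},
    "v": {"0": 1.0},
    "w": {"1": 1.0},
}

same_uv = trace_equivalent(T, O, "u", "v", ["a"], ["0", "1"], horizon=3)
same_tu = trace_equivalent(T, O, "t", "u", ["a"], ["0", "1"], horizon=3)
```

In this toy model `u` and `v` both move to `w` and therefore produce identical observation distributions, while `t` first reaches `u` and so differs from `u` already at length one. The enumeration grows exponentially with the horizon, so the sketch is meant only to make the definition concrete.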
