Conference: International Conference on Application of Information and Communication Technologies

Learning partially observable Markov decision model with EM algorithm

Abstract

Most existing research focuses on POMDP modeling or on solving POMDPs. In some fields, however, a POMDP model must first be learned from historical data before an optimal policy can be obtained from it. Assuming that the historical data comprise an observation sequence and an action sequence, while the state sequence is unobservable, we derive the formulas needed to apply the EM algorithm to estimate the parameters of a POMDP model: the initial state distribution, the stochastic transition matrix, and the observation probability function.
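
The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of how such an estimation could look, not the authors' derivation. It assumes a discrete model, a single trajectory, an observation probability that depends on the current state only, and a Baum-Welch style E-step (scaled forward-backward) conditioned on the logged actions. All names (`forward_backward`, `em_step`, `pi`, `T`, `Z`) and the toy data are hypothetical.

```python
import math


def forward_backward(pi, T, Z, obs, acts):
    """E-step: scaled forward-backward pass over one trajectory.

    Assumed indexing: obs[t] is emitted by hidden state s_t, and acts[t]
    drives the transition s_t -> s_{t+1}, so len(acts) == len(obs) - 1.
    pi[s] = P(s_0 = s), T[a][s][s2] = P(s2 | s, a), Z[s][o] = P(o | s).
    """
    n, L = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(L):
        if t == 0:
            a = [pi[s] * Z[s][obs[0]] for s in range(n)]
        else:
            Ta = T[acts[t - 1]]
            a = [Z[j][obs[t]] * sum(alpha[t - 1][i] * Ta[i][j] for i in range(n))
                 for j in range(n)]
        c = sum(a)  # scaling factor; the product of all c equals P(obs | acts)
        alpha.append([x / c for x in a])
        scale.append(c)
    beta = [[1.0] * n for _ in range(L)]
    for t in range(L - 2, -1, -1):
        Ta = T[acts[t]]
        beta[t] = [sum(Ta[i][j] * Z[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) / scale[t + 1] for i in range(n)]
    # gamma[t][i]: posterior of state i at time t; xi[t][i][j]: posterior of i -> j.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(L)]
    xi = [[[alpha[t][i] * T[acts[t]][i][j] * Z[j][obs[t + 1]] * beta[t + 1][j]
            / scale[t + 1] for j in range(n)] for i in range(n)]
          for t in range(L - 1)]
    loglik = sum(math.log(c) for c in scale)
    return gamma, xi, loglik


def normalize(row, fallback):
    """Turn expected counts into probabilities; keep the old row if all counts are zero."""
    total = sum(row)
    return [x / total for x in row] if total > 0 else list(fallback)


def em_step(pi, T, Z, obs, acts):
    """One EM iteration; returns updated (pi, T, Z) and the log-likelihood of the inputs."""
    n, n_act, n_obs = len(pi), len(T), len(Z[0])
    gamma, xi, loglik = forward_backward(pi, T, Z, obs, acts)
    new_pi = normalize(gamma[0], pi)
    # Expected transition counts, accumulated separately for each action.
    trans = [[[0.0] * n for _ in range(n)] for _ in range(n_act)]
    for t, a in enumerate(acts):
        for i in range(n):
            for j in range(n):
                trans[a][i][j] += xi[t][i][j]
    new_T = [[normalize(trans[a][i], T[a][i]) for i in range(n)]
             for a in range(n_act)]
    # Expected emission counts per state and observation symbol.
    emit = [[0.0] * n_obs for _ in range(n)]
    for t, o in enumerate(obs):
        for i in range(n):
            emit[i][o] += gamma[t][i]
    new_Z = [normalize(emit[i], Z[i]) for i in range(n)]
    return new_pi, new_T, new_Z, loglik


# Toy data (made up for illustration): 2 states, 2 actions, 2 observation symbols.
obs = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
acts = [0, 1, 0, 0, 1, 1, 0, 1, 0]
pi = [0.6, 0.4]
T = [[[0.7, 0.3], [0.4, 0.6]], [[0.2, 0.8], [0.5, 0.5]]]
Z = [[0.8, 0.2], [0.3, 0.7]]
logliks = []
for _ in range(20):
    pi, T, Z, ll = em_step(pi, T, Z, obs, acts)
    logliks.append(ll)
```

Each call to `em_step` returns the log-likelihood of the parameters it was given, so `logliks` should be non-decreasing across iterations, which is a convenient sanity check for any EM implementation. With several trajectories, the expected counts would be summed over trajectories before normalizing.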