
Learning partially observable Markov decision model with EM algorithm


Abstract

Most existing research focuses on POMDP modeling or solution methods. In some fields of study, however, a POMDP model must first be learned from historical data before an optimal policy can be obtained from it. Assuming that the historical data consist of an observation sequence and an action sequence, and that the state sequence is unobservable, we derive the formulas needed to estimate the parameters of a POMDP model with the EM algorithm, including the initial state distribution, the stochastic transition matrix, and the observation probability function.
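The estimation problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' derivation, which is not reproduced on this page; it is a standard Baum-Welch style EM update adapted so that the transition matrix depends on the action taken. The assumptions are: states, actions, and observations are discrete; the observation probability depends only on the current state; a single history is used; and all initial parameter values are strictly positive. The names `em_step`, `pi`, `T`, and `Z` are hypothetical and chosen only for this example.

```python
import math


def _normalize(counts, fallback):
    """Turn expected counts into a probability row; keep the old row if empty."""
    total = sum(counts)
    return [c / total for c in counts] if total > 0 else list(fallback)


def em_step(obs, acts, pi, T, Z):
    """One EM update on a single history (illustrative sketch, not the paper's code).

    obs:  observation indices o_0 .. o_{n-1}
    acts: action indices a_0 .. a_{n-2}; a_t is applied between step t and t+1
    pi[s], T[a][s][s2], Z[s][o]: current parameter estimates
    Returns (new_pi, new_T, new_Z, log-likelihood under the input parameters).
    """
    n, S = len(obs), len(pi)
    A, O = len(T), len(Z[0])

    # E-step, forward pass; each step is rescaled to avoid numerical underflow.
    alpha, scale = [], []
    for t in range(n):
        if t == 0:
            row = [pi[s] * Z[s][obs[0]] for s in range(S)]
        else:
            a = acts[t - 1]
            row = [Z[s2][obs[t]]
                   * sum(alpha[t - 1][s] * T[a][s][s2] for s in range(S))
                   for s2 in range(S)]
        c = sum(row)  # assumed > 0, i.e. the history is possible under the model
        scale.append(c)
        alpha.append([x / c for x in row])

    # E-step, backward pass, reusing the forward scaling constants.
    beta = [[1.0] * S for _ in range(n)]
    for t in range(n - 2, -1, -1):
        a = acts[t]
        for s in range(S):
            beta[t][s] = sum(T[a][s][s2] * Z[s2][obs[t + 1]] * beta[t + 1][s2]
                             for s2 in range(S)) / scale[t + 1]

    # Posterior state marginals gamma_t(s) = P(s_t = s | history).
    gamma = [[alpha[t][s] * beta[t][s] for s in range(S)] for t in range(n)]

    # Expected transition counts, accumulated separately for each action.
    trans = [[[0.0] * S for _ in range(S)] for _ in range(A)]
    for t in range(n - 1):
        a = acts[t]
        for s in range(S):
            for s2 in range(S):
                trans[a][s][s2] += (alpha[t][s] * T[a][s][s2] * Z[s2][obs[t + 1]]
                                    * beta[t + 1][s2] / scale[t + 1])

    # Expected observation counts per state.
    emit = [[0.0] * O for _ in range(S)]
    for t in range(n):
        for s in range(S):
            emit[s][obs[t]] += gamma[t][s]

    # M-step: renormalize the expected counts.
    new_pi = list(gamma[0])
    new_T = [[_normalize(trans[a][s], T[a][s]) for s in range(S)]
             for a in range(A)]
    new_Z = [_normalize(emit[s], Z[s]) for s in range(S)]
    log_lik = sum(math.log(c) for c in scale)
    return new_pi, new_T, new_Z, log_lik
```

Calling `em_step` repeatedly and feeding each result back in gives the usual EM loop; the returned log-likelihood should not decrease from one iteration to the next, which is a convenient sanity check. With several histories, the expected counts would be summed over histories before normalizing.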


Bibliographic record

Source: International Conference on Application of Information and Communication Technologies, 2013, 4 pages
Authors: Hui Tan; Shaohui Ma
Original format: PDF
Classification (CLC): Communications
Keywords: EM Algorithm; HMM; POMDP Model

