International Journal of Advanced Robotic Systems

Kalman Based Finite State Controller for Partially Observable Domains

Abstract

A real-world environment is often only partially observable to an agent, either because of noisy sensors or incomplete perception. Moreover, such an environment naturally has a continuous state space, and the agent must decide on an action for every point in its internal continuous belief space. Consequently, it is convenient to model this type of decision-making problem as a Partially Observable Markov Decision Process (POMDP) with continuous observation and state spaces. Most POMDP methods, whether approximate or exact, assume that the underlying world dynamics or POMDP parameters, such as the transition and observation probabilities, are known. However, for many real-world environments it is very difficult, if not impossible, to obtain such information. We assume that only the internal dynamics of the agent, such as the actuator noise and the interpretation of the sensor suite, are known. Using these internal dynamics, our algorithm, the Kalman Based Finite State Controller (KBFSC), constructs an internal world model over the continuous belief space, represented by a finite state automaton. The constructed automaton nodes are points of the continuous belief space that share a common best action and a common uncertainty level. KBFSC deals with continuous Gaussian-based POMDPs. It uses a Kalman Filter for belief state estimation, which is also an efficient way to prune unvisited segments of the belief space and to foresee the reachable belief points while approximately computing the horizon-N policy. KBFSC does not use an "explore and update" approach in the value calculation, as TD-learning does, and therefore has no extensive exploration-exploitation phase. Using the MDP-case reward and the internal dynamics of the agent, KBFSC automatically constructs the finite state automaton (FSA) representing the approximately optimal policy without discretizing the state and observation spaces. Moreover, the policy always converges for POMDP problems.
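The abstract states that KBFSC tracks the agent's belief as a Gaussian with a Kalman Filter. As a point of reference, the sketch below shows a standard linear-Gaussian Kalman belief update of the kind such a controller would rely on. The function name, the matrices A, B, C, R, Q, and the linear motion and observation models are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def kalman_belief_update(mu, Sigma, u, z, A, B, C, R, Q):
    """One belief update over a Gaussian belief N(mu, Sigma).

    Assumes a linear motion model x' = A x + B u + w, w ~ N(0, R),
    and a linear observation model z = C x + v, v ~ N(0, Q).
    """
    # Prediction (action update): propagate the belief through the motion model.
    mu_bar = A @ mu + B @ u
    Sigma_bar = A @ Sigma @ A.T + R

    # Correction (observation update): fold in the new measurement z.
    S = C @ Sigma_bar @ C.T + Q                 # innovation covariance
    K = Sigma_bar @ C.T @ np.linalg.inv(S)      # Kalman gain
    mu_new = mu_bar + K @ (z - C @ mu_bar)
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_bar
    return mu_new, Sigma_new
```

Because the updated belief stays Gaussian, its mean and covariance are a compact summary of a belief point; the covariance also gives the uncertainty level by which, per the abstract, KBFSC groups belief points into automaton nodes.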
