Construction of embedded Markov decision processes for optimal control of non-linear systems with continuous state spaces


Abstract

We consider the problem of constructing a suitable discrete-state approximation of an arbitrary non-linear dynamical system with continuous state space and discrete control actions that would allow close to optimal sequential control of that system by means of value or policy iteration on the approximated model. We propose a method for approximating the continuous dynamics by means of an embedded Markov decision process (MDP) model defined over an arbitrary set of discrete states sampled from the original continuous state space. The mathematical similarity between sets of barycentric coordinates (convex combinations) and probability mass functions is exploited to compute the transition matrices and initial state distribution of the MDP. Barycentric coordinates are computed efficiently on a Delaunay triangulation of the set of discrete states, ensuring maximal accuracy of the approximation and the resulting control policy.
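
The construction described in the abstract can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it uses a one-dimensional state space, where the Delaunay triangulation reduces to the intervals between consecutive sorted samples, so the barycentric coordinates of a successor state are just its two linear interpolation weights. The dynamics `step`, the reward, the sample points, the discount factor, and the clamping of successors that leave the sampled range are all hypothetical choices made for the sketch.

```python
import math
from bisect import bisect_right


def barycentric_1d(states, x):
    """Express x as a convex combination of the sampled states bracketing it.

    Returns {state_index: weight}. Points outside the sampled range are
    clamped to the nearest boundary state (an assumption of this sketch).
    """
    if x <= states[0]:
        return {0: 1.0}
    if x >= states[-1]:
        return {len(states) - 1: 1.0}
    hi = bisect_right(states, x)      # first sample strictly greater than x
    lo = hi - 1
    w_hi = (x - states[lo]) / (states[hi] - states[lo])
    return {lo: 1.0 - w_hi, hi: w_hi}


def build_transitions(states, actions, step):
    """P[a][i][j]: probability of moving from state i to state j under action a.

    The barycentric weights of the successor state are read directly as a
    probability mass function over the vertices of its enclosing simplex.
    """
    n = len(states)
    P = []
    for a in actions:
        rows = []
        for s in states:
            row = [0.0] * n
            for j, w in barycentric_1d(states, step(s, a)).items():
                row[j] += w
            rows.append(row)
        P.append(rows)
    return P


def value_iteration(P, R, gamma=0.95, sweeps=500):
    """Plain value iteration on the embedded MDP (fixed number of sweeps)."""
    n = len(P[0])
    V = [0.0] * n
    for _ in range(sweeps):
        V = [
            max(
                R[a][i] + gamma * sum(p * v for p, v in zip(P[a][i], V))
                for a in range(len(P))
            )
            for i in range(n)
        ]
    return V


# Hypothetical non-linear dynamics with three discrete control actions.
def step(x, a):
    return x + 0.2 * (a - math.sin(x))


states = [-2.0, -1.2, -0.5, 0.0, 0.3, 1.0, 2.0]   # irregular, sorted samples
actions = [-1.0, 0.0, 1.0]
P = build_transitions(states, actions, step)
R = [[-(s ** 2) for s in states] for _ in actions]  # hypothetical reward
V = value_iteration(P, R)
```

Each row of `P` has at most two non-zero entries, and its expectation over the sampled states reproduces the (clamped) successor state exactly, which is the property that makes the weights usable as transition probabilities. In higher dimensions the same idea applies with d + 1 weights per simplex, and locating the enclosing simplex requires an actual triangulation rather than a sorted list.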


Bibliographic record

Source
Decision and Control and European Control Conference (CDC-ECC), 2011 50th IEEE Conference on | 2011 | pp. 7944-7949 | 6 pages
Conference location: Orlando, FL (US)
Authors
Nikovski, Daniel; Esenther, Alan
Author affiliation

Mitsubishi Electric Research Laboratories 201 Broadway Cambridge MA 02139 USA;

Conference organizer
Original format: PDF
Text language
Chinese Library Classification
Keywords
Markov decision process models; dynamic programming; embedded Markov chains; optimal control;


Similar literature


1. Bias and overtaking optimality for continuous-time jump Markov decision processes in Polish spaces [J]. Zhu QX, Prieto-Rumeau T. Journal of Applied Probability. 2008, Issue 2
2. Average optimality inequality for continuous-time Markov decision processes in Polish spaces [J]. Quanxin Zhu. Mathematical Methods of Operations Research. 2007, Issue 2
3. Average optimality for continuous-time Markov decision processes in Polish spaces [J]. Guo XP, Rieder U. The Annals of Applied Probability: an official journal of the Institute of Mathematical Statistics. 2006, Issue 2
4. Discounted Optimality for Continuous-Time Markov Decision Processes in Polish Spaces [C]. Xianping Guo. 2006
5. Real-time optimal control using subspace techniques for embedded systems with DSP implementations [D]. Pongpairoj, Harin. 2004
6. Using model-based proposals for fast parameter inference on discrete state space continuous-time Markov processes [O]. C. M. Pooley, S. C. Bishop, G. Marion. 2015
7. Construction of embedded Markov decision processes for optimal control of non-linear systems with continuous state spaces [O]. D. Esenther, Daniel Nikovski, Alan Esenther. 2015
8. Theory for Semi-Markov Decision Processes with Unbounded Costs and Its Application to the Optimal Control of Queueing Systems [R]. Orkenyi, P. 1976
