The mirror descent control algorithm for weakly regular homogeneous finite Markov chains with unknown mean losses

Abstract

We address the adaptive stochastic control problem for a discrete-time system described by a controlled Markov chain with a finite number of states. A randomized mirror descent control algorithm is proposed and studied for the class of controlled homogeneous finite Markov chains with unknown mean losses, developing the approach presented in Nazin and Miller (2011). The main assumptions are the following: the processes are independent and stationary, the nonnegative random losses are almost surely bounded by a given constant, and the connectivity assumption holds for the controlled Markov chain. The uncertainty is that the mean loss matrix is unknown. The novelty of the approach lies in extending the class of controlled homogeneous finite Markov chains to chains satisfying the connectivity assumption. The main result is an upper bound that is asymptotic in time, together with an explicit constant that depends weakly on the logarithm of the number of states.

Bibliographic details

Source: Decision and Control and European Control Conference (CDC-ECC), 2011 50th IEEE Conference on, 2011, pp. 1779-1783 (5 pages)
Conference location: Orlando, FL (US)
Authors: Nazin, Alexander V.; Miller, Boris
Author affiliation: Laboratory for Adaptive and Robust Control Systems, Institute of Control Sciences RAS, 65 Profsoyuznaya str., 117997 Moscow, Russia

Original format: PDF
Date added to database: 2022-08-26 14:34:14

Similar literature

1. A dynamic programming approach for finite Markov processes and algorithms for the calculation of the limit matrix in Markov chains [J]. Stefan Pickl, Dmitrii Lozovanu. Optimization: A Journal of Mathematical Programming and Operations Research, 2011, No. 10a12.
2. WEAK AND UNIFORM WEAK Δ-ERGODICITY FOR [Δ]-GROUPABLE FINITE MARKOV CHAINS [J]. UDREA PAUN. Mathematical Reports, 2004, No. 3.
3. A Mirror Descent Algorithm for Minimization of Mean Poisson Flow Driven Losses [J]. A. V. Nazin, S. V. Anulova, A. A. Tremba. Automation and Remote Control, 2014, No. 6.
4. Mirror Descent Algorithm for Homogeneous Finite Controlled Markov Chains with Unknown Mean Losses [C]. Alexander V. Nazin, Boris M. Miller. IFAC World Congress, 2011.
5. Finite Difference and Markov Chain Approximation for Option Pricing and Hedging: Algorithms and Error Analysis [D]. Zhang, Gongqiu. 2017.
6. KULLBACK-LEIBLER MARKOV CHAIN MONTE CARLO - A NEW ALGORITHM FOR FINITE MIXTURE ANALYSIS AND ITS APPLICATION TO GENE EXPRESSION DATA [O]. TATIANA TATARINOVA, JOHN BOUCK, ALAN SCHUMITZKY.
7. Weak law of large numbers for some Markov chains along non homogeneous genealogies [O]. Vincent Bansaye, Chunmao Huang. 2015.
8. Row-Continuous Finite Markov Chains, Structure and Algorithms [R]. Keilson, J., Sumita, U., Zachmann, M. 1981.
