Chinese Control Conference

Finite convergence of value iteration algorithm for discounted infinite horizon optimal control of stochastic logical systems


Abstract

This paper investigates the discounted infinite horizon optimal control problem for stochastic multi-valued logical dynamical systems with finite states. After giving equivalent descriptions of the stochastic logical dynamical system in terms of a Markov decision process, the infinite horizon optimization problem is presented in an algebraic form. Based on the semi-tensor product of matrices and the increasing-dimension technique, it is proved that the optimal stationary policy can be obtained by a finite horizon value iteration process, and an exact estimate of the horizon length required by the finite horizon approach is derived. As an application, the optimization problem of a human-machine game is investigated.
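For readers unfamiliar with the setting, the following is a minimal sketch of ordinary finite horizon value iteration on a small discounted Markov decision process, followed by extraction of the greedy stationary policy. It is not the paper's semi-tensor product formulation and does not reproduce its horizon length estimate. The names `value_iteration`, `P`, `g`, `discount` and `horizon`, the cost-minimization convention, and the two-state example data are all assumptions made for illustration.

```python
import numpy as np

def value_iteration(P, g, discount, horizon):
    """Run `horizon` steps of discounted value iteration (cost minimization).

    P[a] is the n-by-n transition matrix under action a (rows sum to 1).
    g[a] is the length-n vector of stage costs under action a.
    Returns the value estimate and the greedy stationary policy.
    """
    P = np.asarray(P, dtype=float)      # shape (m, n, n)
    g = np.asarray(g, dtype=float)      # shape (m, n)
    V = np.zeros(P.shape[1])            # terminal value assumed to be zero
    for _ in range(horizon):
        Q = g + discount * (P @ V)      # Q[a, x]: cost of action a in state x
        V = Q.min(axis=0)               # Bellman backup
    Q = g + discount * (P @ V)
    return V, Q.argmin(axis=0)          # greedy policy w.r.t. the final V

# Hypothetical two-state, two-action example (values chosen arbitrarily).
P = [[[0.9, 0.1], [0.2, 0.8]],          # transitions under action 0
     [[0.5, 0.5], [0.6, 0.4]]]          # transitions under action 1
g = [[1.0, 2.0],                        # stage costs under action 0
     [1.5, 0.5]]                        # stage costs under action 1

V_short, policy_short = value_iteration(P, g, discount=0.9, horizon=50)
V_long, policy_long = value_iteration(P, g, discount=0.9, horizon=500)
print(policy_short, policy_long)
```

In this toy example the greedy policy after 50 backups already coincides with the one after 500, which illustrates the kind of finite horizon behavior the abstract describes. The horizon of 50 is an arbitrary choice here, not a bound derived from the paper.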
