首页> 外文会议>AIAA infotech@aerospace conference and exhibit >A Lower Bounding Linear Programming approach to the Perimeter Patrol Stochastic Control Problem
【24h】

A Lower Bounding Linear Programming approach to the Perimeter Patrol Stochastic Control Problem

机译:周边巡逻随机控制问题的下限线性规划方法

获取原文

摘要

One encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for large scale controlled Markov chains. In this article, we consider a perimeter patrol stochastic optimal control problem. To determine the optimal control policy, one has to solve a Markov decision problem, whose large size renders exact dynamic programming methods intractable. So, we propose a state aggregation based approximate linear programming method to construct provably good sub-optimal policies instead. The state-space is partitioned and the optimal cost-to-go or value function is restricted to be a constant over each partition. We show that the resulting restricted system of linear inequalities embeds a family of Markov chains of lower dimension, one of which can be used to construct a tight lower bound on the optimal value function. In general, the construction of the lower bound requires the solution to a combinatorial problem. But the perimeter patrol problem exhibits a special structure that enables tractable linear programming formulation for the lower bound. We demonstrate this and also provide numerical results that corroborate the efficacy of the proposed methodology.
机译:在动态规划应用中,遇到维度的诅咒,以确定大规模控制的马尔可夫链的最佳政策。在本文中,我们考虑了一个周边巡逻随机最佳控制问题。要确定最佳控制策略,必须解决马尔可夫决策问题,其大尺寸呈现精确的动态编程方法难以解返。因此,我们提出了一种基于国家聚合的近似线性编程方法来构建可释放的良好的次优策略。状态空间被分区,并且最佳成本转到或value函数被限制为在每个分区上的常量。我们表明,由此产生的限制线性不平等系统嵌入了一系列Markov链条的下尺寸,其中一个可用于在最佳值函数上构造紧密的下限。通常,下限的构建需要解决组合问题。但周边巡逻问题表现出一种特殊的结构,使得能够为下限提供易于线性编程配方。我们证明了这一点,还提供了证实拟议方法的功效的数值结果。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号