首页> 外文会议>AIAA infotech@aerospace conference and exhibit >A Lower Bounding Linear Programming approach to the Perimeter Patrol Stochastic Control Problem
【24h】

A Lower Bounding Linear Programming approach to the Perimeter Patrol Stochastic Control Problem

机译:边界巡逻随机控制问题的下界线性规划方法

获取原文

摘要

One encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for large scale controlled Markov chains. In this article, we consider a perimeter patrol stochastic optimal control problem. To determine the optimal control policy, one has to solve a Markov decision problem, whose large size renders exact dynamic programming methods intractable. So, we propose a state aggregation based approximate linear programming method to construct provably good sub-optimal policies instead. The state-space is partitioned and the optimal cost-to-go or value function is restricted to be a constant over each partition. We show that the resulting restricted system of linear inequalities embeds a family of Markov chains of lower dimension, one of which can be used to construct a tight lower bound on the optimal value function. In general, the construction of the lower bound requires the solution to a combinatorial problem. But the perimeter patrol problem exhibits a special structure that enables tractable linear programming formulation for the lower bound. We demonstrate this and also provide numerical results that corroborate the efficacy of the proposed methodology.
机译:人们在应用动态规划来确定大规模受控马尔可夫链的最优策略时遇到了维数的诅咒。在本文中,我们考虑了周边巡逻随机最优控制问题。为了确定最佳控制策略,必须解决马尔可夫决策问题,该问题的大尺寸使得精确的动态编程方法难以处理。因此,我们提出了一种基于状态聚集的近似线性规划方法来构造可证明良好的次优策略。对状态空间进行了分区,并且将最佳成本或价值函数限制为每个分区上的常量。我们表明,由此产生的线性不等式的受限系统嵌入了较低维的马尔可夫链族,其中之一可用于构造最优值函数上的紧下界。通常,下界的构造需要解决组合问题。但是周界巡逻问题表现出一种特殊的结构,可以为下限制定易于处理的线性规划公式。我们证明了这一点,并提供了数值结果,证实了所提出方法的有效性。

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号