
Minimization of Variance on Controlled Markov Chain



Abstract

We consider a variance criterion on a finite-stage controlled Markov chain. The variance is the sample variance of the sequence of stage rewards (random variables) on the chain. Through a nonconventional dynamic programming approach, we minimize the expected value of the variance over some large policy class. For computational simplicity, we take the variance multiplied by the square of the total number of stages. Our invariant imbedding method expands the original state space by one dimension. We illustrate a two-state, two-action, two-stage model by a stochastic decision tree-table method. The optimal solution is obtained by solving the backward recursive equation for the minimum value functions on the expanded state spaces.
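The backward recursion described above can be sketched in code. For n stages, n² times the sample variance equals n·Σr² − (Σr)² pathwise, so its expectation is minimized by a dynamic program on a state space expanded by one dimension: the accumulated reward sum s. Below is a minimal sketch of that recursion on a hypothetical two-state, two-action, two-stage model; the reward and transition data are invented for illustration and are not taken from the paper.

```python
from functools import lru_cache

N = 2                      # number of stages
STATES = (0, 1)
ACTIONS = (0, 1)

def reward(x, a):
    """Stage reward r(x, a) -- hypothetical toy data."""
    return 1.0 if a == 0 else float(2 * x)

def trans(x, a):
    """Transition probabilities p(x' | x, a) as a dict -- hypothetical toy data."""
    if a == 0:
        return {x: 1.0}        # action 0: stay put
    return {0: 0.5, 1: 0.5}    # action 1: jump uniformly

@lru_cache(maxsize=None)
def V(t, x, s):
    """Minimum expected value of N*sum(r^2) - (sum r)^2 accumulated from
    stage t onward, given current state x and reward sum s so far.
    The pair (x, s) is the expanded state of the invariant imbedding."""
    if t == N:
        return -s * s          # terminal term: subtract (total reward)^2
    best = float("inf")
    for a in ACTIONS:
        r = reward(x, a)
        val = N * r * r + sum(p * V(t + 1, x2, s + r)
                              for x2, p in trans(x, a).items())
        best = min(best, val)
    return best

# n^2 times the minimal expected sample variance, starting from state 0:
min_n2_var = V(0, 0, 0.0)
print(min_n2_var)  # here 0.0: always taking action 0 yields constant rewards
```

Note that policies in this recursion may condition on the accumulated sum s as well as the chain state x, which is what makes the policy class "large"; an ordinary Markov policy would ignore s.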


