
Minimization of Variance on Controlled Markov Chain



Abstract

We consider a variance criterion on a finite-stage controlled Markov chain. The variance is the sample variance of the sequence of stage rewards (random variables) on the chain. Through a nonconventional dynamic programming approach, we minimize the expected value of the variance over a large policy class. For computational simplicity, we take the variance multiplied by the square of the total number of stages. Our invariant imbedding method expands the original state space by one dimension. We illustrate a two-state, two-action, two-stage model using a stochastic decision tree-table method. The optimal solution is obtained by solving the backward recursive equation for the minimum value functions on the expanded state spaces.
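The recursion described in the abstract can be sketched as follows. For $n$ stages with rewards $r_1,\dots,r_n$, the criterion $n^2 \times$ (sample variance) equals $n\sum_u r_u^2 - (\sum_u r_u)^2$; the additive first term is handled by ordinary dynamic programming, while the squared total requires carrying the accumulated reward sum $c$ as one extra state dimension — the invariant imbedding the abstract mentions. This is a minimal illustrative sketch, not the paper's actual model: the transition probabilities `P`, rewards `R`, and the assumption that the reward depends only on the current state and action are all made up for illustration.

```python
# Backward recursion on the expanded state space (s, c), where c is the
# accumulated reward sum. Value function:
#   V_t(s, c) = min_policy E[ N * sum_{u=t..N} r_u^2 - (c + sum_{u=t..N} r_u)^2 ]
# with terminal condition V_{N+1}(s, c) = -c^2, so that V_1(s0, 0) is the
# minimum expected value of N^2 * (sample variance of the stage rewards).

N = 2                                      # two stages
P = {                                      # P[s][a][s']: transition probabilities (illustrative)
    0: {0: [0.8, 0.2], 1: [0.3, 0.7]},
    1: {0: [0.5, 0.5], 1: [0.1, 0.9]},
}
R = {                                      # R[s][a]: stage reward (illustrative)
    0: {0: 1.0, 1: 2.0},
    1: {0: 0.0, 1: 3.0},
}

def V(t, s, c):
    """Minimum value function on the expanded state (s, c) at stage t."""
    if t > N:
        return -c * c                      # terminal term of the expanded criterion
    best = float("inf")
    for a in (0, 1):                       # two actions
        r = R[s][a]
        future = sum(P[s][a][s2] * V(t + 1, s2, c + r) for s2 in (0, 1))
        best = min(best, N * r * r + future)
    return best

# Minimum expected value of N^2 * sample variance, starting in state 0:
print(V(1, 0, 0.0))
```

Because the model is deliberately tiny (two states, two actions, two stages), plain recursion suffices; for longer horizons the set of reachable accumulated sums `c` grows, which is the computational price of the one extra dimension.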
