Journal: Computational Optimization and Applications

Efficient sampling in approximate dynamic programming algorithms

Abstract

Dynamic Programming (DP) is a standard optimization tool for solving Stochastic Optimal Control (SOC) problems over either a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms approximate the cost-to-go functions by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. The paper discusses the problem of sample complexity, i.e., how "fast" the number of points must grow with the input dimension in order to obtain an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration. It is shown that sampling based on low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
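To make the idea of low-discrepancy sampling concrete, the following is a minimal sketch of one well-known low-discrepancy sequence, the Halton sequence, used to spread sampling points over the unit hypercube standing in for a normalized state space. The abstract does not say which sequence the paper uses, so the choice of Halton is an assumption, and the names `van_der_corput`, `halton` and `mean_over_samples` are hypothetical helpers introduced only for illustration. The integrand x1*x2 is likewise just a toy stand-in for a cost-to-go function.

```python
import random


def van_der_corput(index, base):
    """Radical inverse of a positive integer `index` in the given `base`."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton(n_points, bases=(2, 3)):
    """First `n_points` Halton points in [0, 1)^d, one coprime base per dimension."""
    return [
        tuple(van_der_corput(i, b) for b in bases)
        for i in range(1, n_points + 1)
    ]


def mean_over_samples(func, points):
    """Sample average of `func` over the given points (a quadrature estimate)."""
    return sum(func(p) for p in points) / len(points)


def toy_cost(p):
    # Illustrative smooth function on [0, 1]^2; its exact integral is 0.25.
    return p[0] * p[1]


if __name__ == "__main__":
    n = 1024
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    iid_points = [(rng.random(), rng.random()) for _ in range(n)]
    print("Halton estimate:", mean_over_samples(toy_cost, halton(n)))
    print("i.i.d. estimate:", mean_over_samples(toy_cost, iid_points))
```

Running the script prints both estimates of the integral, whose exact value is 0.25, so the two sampling schemes can be compared directly for different values of `n`. In an approximate DP setting, points generated this way would be rescaled to the actual state-space bounds and used as the locations at which the cost-to-go function is evaluated and fitted.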
