IEEE Transactions on Neural Networks and Learning Systems

Algorithmic Survey of Parametric Value Function Approximation



Abstract

Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. A recurrent subtopic of RL concerns computing an approximation of this value function when the system is too large for an exact representation. This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific minimization method, generally a stochastic gradient descent or a recursive least-squares approach.
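To make the abstract's categories concrete, the following is a minimal sketch of one member of the "bootstrapping" family it mentions: semi-gradient TD(0) with a linear parametric value function V_theta(s) = theta · phi(s), minimized by stochastic gradient descent. The 3-state chain MDP, the one-hot features, and the function names are illustrative assumptions, not taken from the survey itself.

```python
import numpy as np

def phi(s, n_states=3):
    """One-hot feature vector for state s (tabular features as a special case
    of a linear parametric architecture). Illustrative choice."""
    f = np.zeros(n_states)
    f[s] = 1.0
    return f

def td0_linear(episodes, alpha=0.1, gamma=0.9, n_states=3):
    """Evaluate a fixed policy on a toy deterministic chain: state s moves to
    s + 1, with reward 1 on entering the terminal state (s = n_states - 1).

    This is semi-gradient TD(0): the target r + gamma * V(s') is treated as a
    constant (bootstrapping), and theta follows a stochastic gradient step on
    the resulting squared TD error."""
    theta = np.zeros(n_states)
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            s_next = s + 1
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Terminal states have value zero by convention.
            v_next = 0.0 if s_next == n_states - 1 else theta @ phi(s_next)
            # TD error: bootstrapped target minus current estimate.
            delta = r + gamma * v_next - theta @ phi(s)
            # Semi-gradient SGD update: theta += alpha * delta * grad V_theta(s).
            theta += alpha * delta * phi(s)
            s = s_next
    return theta
```

On this chain the exact values are V(1) = 1 and V(0) = gamma * V(1) = 0.9, and `td0_linear(1000)` converges to them; residual and projected fixed-point methods in the survey start from the same TD error but minimize different cost functions over it.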
