Bellman residuals minimization using online support vector machines

Abstract

In this paper we present and theoretically study an Approximate Policy Iteration (API) method called API-BRMε, which uses a very effective implementation of incremental Support Vector Regression (SVR) to approximate the value function and is able to generalize in Reinforcement Learning (RL) problems with continuous (or large) state spaces. API-BRMε is presented as a non-parametric regularization method based on an outcome of Bellman Residual Minimization (BRM) that is able to minimize the variance of the problem. The proposed method can be cast as incremental and may be applied to the on-line agent interaction framework of RL. Because it is based on SVR, which in turn rests on convex optimization, it is able to find the global solution of the problem. API-BRMε using SVR can be seen as a regularization problem with the ε-insensitive loss. Compared with the standard squared loss also used in regularization, this loss naturally yields a sparse solution for the approximating function. We extensively analyze the statistical properties of API-BRMε, deriving a bound that controls the performance loss of the algorithm under some assumptions on the kernel and assuming that the collected samples are non-i.i.d. and follow a beta-mixing process. Some experimental evidence and performance results on well-known RL benchmarks are also presented.
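
To illustrate why the ε-insensitive loss leads to sparsity while the squared loss does not, here is a minimal sketch in Python. It is not the paper's algorithm: the helper names (`bellman_residuals`, `eps_insensitive`), the toy transition values, and the choices `gamma = 0.9` and `eps = 0.1` are all assumptions made for illustration. It computes sampled Bellman residuals for a fixed policy and compares the two losses on them.

```python
import numpy as np

def bellman_residuals(q_sa, rewards, q_next, gamma):
    """Sampled Bellman residual Q(s, a) - (r + gamma * Q(s', pi(s')))."""
    return q_sa - (rewards + gamma * q_next)

def eps_insensitive(residuals, eps):
    """Epsilon-insensitive loss: zero inside the tube |r| <= eps, linear outside."""
    return np.maximum(0.0, np.abs(residuals) - eps)

# Hypothetical toy batch of four transitions (illustrative values only).
q_sa = np.array([1.00, 0.50, 2.00, 0.00])     # current estimates Q(s, a)
rewards = np.array([0.10, 0.00, 1.00, 0.00])  # observed rewards r
q_next = np.array([1.00, 0.50, 0.50, 0.10])   # estimates Q(s', pi(s'))
gamma = 0.9                                   # assumed discount factor

res = bellman_residuals(q_sa, rewards, q_next, gamma)
loss_eps = eps_insensitive(res, eps=0.1)  # assumed tube width
loss_sq = res ** 2

# Samples whose residual lies inside the tube contribute no loss at all.
n_active_eps = int(np.count_nonzero(loss_eps > 1e-12))
n_active_sq = int(np.count_nonzero(loss_sq > 1e-12))
print(n_active_eps, n_active_sq)
```

In an SVR fit, samples whose residuals fall strictly inside the ε-tube receive zero dual coefficients and so do not become support vectors. The squared loss penalizes every nonzero residual, so in general every sample contributes to the solution.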
