Regularized Fitted Q-Iteration: Application to Planning

Abstract

We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model-complexity. The algorithm is presented in detail for the case when the function space is a reproducing-kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
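
The procedure the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a one-dimensional toy environment (`generative_model`), a kernel on state-action pairs built from a Gaussian kernel on states times an indicator on actions, and hand-picked values for `gamma`, `lam`, and `bandwidth`. All function and parameter names are hypothetical.

```python
import numpy as np

def rbf(X, Y, bandwidth):
    """Gaussian kernel matrix between 1-D state arrays X and Y."""
    return np.exp(-(X[:, None] - Y[None, :]) ** 2 / (2.0 * bandwidth ** 2))

def generative_model(s, a):
    """Hypothetical toy environment on [0, 1]: action 1 moves right, 0 moves left."""
    s_next = np.clip(s + np.where(a == 1, 0.1, -0.1), 0.0, 1.0)
    reward = s_next  # assumption: reward equals the next state
    return s_next, reward

def regularized_fqi(s, a, r, s_next, n_actions, gamma=0.8, lam=1e-3,
                    bandwidth=0.2, n_iters=30):
    """Fitted Q-iteration with penalized least squares in an RKHS as the regressor."""
    n = len(s)
    # Kernel on state-action pairs: Gaussian in the state, indicator in the action.
    K = rbf(s, s, bandwidth) * (a[:, None] == a[None, :])
    # Kernel between every next state and the sample, one slice per candidate action.
    K_next = np.stack([rbf(s_next, s, bandwidth) * (a[None, :] == b)
                       for b in range(n_actions)])  # shape (n_actions, n, n)
    alpha = np.zeros(n)  # coefficients of Q_0 = 0
    for _ in range(n_iters):
        q_next = K_next @ alpha  # shape (n_actions, n)
        targets = r + gamma * q_next.max(axis=0)
        # Regularized least squares: solve (K + n * lam * I) alpha = targets.
        alpha = np.linalg.solve(K + n * lam * np.eye(n), targets)

    def q_values(states):
        """Return an array of shape (len(states), n_actions) of estimated Q-values."""
        states = np.asarray(states, dtype=float)
        return np.stack([(rbf(states, s, bandwidth) * (a[None, :] == b)) @ alpha
                         for b in range(n_actions)], axis=1)
    return q_values

# Draw a sample from the generative model, then plan with regularized FQI.
rng = np.random.default_rng(0)
s = rng.uniform(0.0, 1.0, size=200)
a = rng.integers(0, 2, size=200)
s_next, r = generative_model(s, a)
q = regularized_fqi(s, a, r, s_next, n_actions=2)
greedy = q(np.array([0.2, 0.5, 0.8])).argmax(axis=1)
print(greedy)
```

Here `lam` plays the role of the penalty weight: larger values shrink the fitted Q-function toward zero, smaller values fit the regression targets more closely. The sketch fixes `lam` by hand, whereas the abstract argues for data-dependent penalties.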

Bibliographic details

Source: European Workshop on Reinforcement Learning | 2008 | 14 pages
Authors: Amir Massoud Farahmand; Mohammad Ghavamzadeh; Csaba Szepesvari; Shie Mannor
Original format: PDF
Chinese Library Classification: TP18-53
Date indexed: 2022-08-20 21:22:00

Similar literature

1. Scenario-based fitted Q-iteration for adaptive control of water reservoir systems under uncertainty [J]. Federica Bertoni, Matteo Giuliani, Andrea Castelletti. IFAC PapersOnLine. 2017, no. 1
2. Boosted Fitted Q-Iteration [J]. Samuele Tosatto, Matteo Pirotta, Carlo D’Eramo. JMLR: Workshop and Conference Proceedings. 2017, no. 1
3. Boosted Fitted Q-Iteration [J]. Samuele Tosatto, Matteo Pirotta, Carlo D’Eramo. JMLR: Workshop and Conference Proceedings. 2017, no. 2009
4. Regularized Fitted Q-Iteration: Application to Planning [C]. Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvari. Recent advances in reinforcement learning. 2008
5. Implementation and applications of density-fitted symmetry-adapted perturbation theory [D]. Hohenstein, Edward G. 2011
6. An Interface-Fitted Finite Element Level Set Method with Application to Solidification and Solvation [O]. Bo Li, John Shopple
7. Regularized Fitted Q-iteration: Application to Planning [O]. Amir Massoud Farahm, Mohammad Ghavamzadeh, Csaba Szepesvári. 2009
8. BODYFIT-2PE-HEM: LWR Core Thermal-Hydraulic Code Using Boundary-Fitted Coordinates and Two-Phase Homogeneous Equilibrium Model. Volume 3: Validation and Applications [R]. Chen, B. C. J., Chien, T. H., Sha, W. T. 1985
