TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite-sample analysis for TD(0) with function approximation, even in the linear case. Our work is the first to provide such results. Previous works that obtained convergence rates for online Temporal Difference (TD) methods analyzed somewhat modified versions of them, which include projections and step-sizes that depend on unknown problem parameters. Our analysis obviates these artificial alterations by exploiting strong properties of TD(0). We provide convergence rates both in expectation and with high probability. Both are based on relatively little-known, recently developed stochastic approximation techniques.
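For concreteness, the TD(0) update with linear function approximation can be sketched as follows; the random-walk MDP, one-hot features, step size, and episode count below are illustrative assumptions, not details taken from this work:

```python
import numpy as np

# A minimal sketch of TD(0) with linear function approximation on a
# hypothetical 5-state random walk (all hyperparameters are illustrative).

rng = np.random.default_rng(0)
n_states, gamma, alpha = 5, 0.9, 0.05
phi = np.eye(n_states)      # one-hot features: tabular is a special case of linear FA
theta = np.zeros(n_states)  # weight vector; V(s) is approximated by phi[s] @ theta

for episode in range(2000):
    s = 2  # start in the middle of the chain
    while True:
        s_next = s + rng.choice([-1, 1])          # unbiased random walk
        done = s_next < 0 or s_next >= n_states   # exit at either end
        r = 1.0 if s_next >= n_states else 0.0    # reward only on the right exit
        v_next = 0.0 if done else phi[s_next] @ theta
        # TD(0) update: move theta along the gradient of V(s),
        # scaled by the temporal-difference error.
        td_error = r + gamma * v_next - phi[s] @ theta
        theta += alpha * td_error * phi[s]
        if done:
            break
        s = s_next

print(np.round(theta, 2))  # estimated state values, increasing toward the right exit
```

Note that this plain update uses neither a projection step nor problem-dependent step sizes, which is precisely the unmodified form of the algorithm the analysis addresses.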