
Value Function Approximation in Noisy Environments Using Locally Smoothed Regularized Approximate Linear Programs

Abstract

Recently, Petrik et al. demonstrated that L_1-Regularized Approximate Linear Programming (RALP) could produce value functions and policies that compared favorably to established linear value function approximation techniques such as LSPI. RALP's success stems primarily from its ability to solve the feature selection and value function approximation steps simultaneously. However, RALP's performance guarantees become looser if sampled next states are used. For very noisy domains, RALP requires an accurate model rather than samples, which can be unrealistic in some practical scenarios. In this paper, we demonstrate this weakness, and then introduce Locally Smoothed L_1-Regularized Approximate Linear Programming (LS-RALP). We demonstrate that LS-RALP mitigates inaccuracies stemming from noise even without an accurate model. We show that, given some smoothness assumptions, error from noise approaches zero as the number of samples increases, and we provide experimental examples of LS-RALP's success on common reinforcement learning benchmark problems.
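The abstract does not spell out how the local smoothing is computed. As a rough illustration only, the sketch below assumes that "locally smoothed" means replacing each sampled one-step backup (reward plus discounted next-state value) with a kernel-weighted average of the backups sampled at nearby states, so that zero-mean noise tends to average out as samples accumulate. The Gaussian kernel, the `bandwidth` parameter, the fixed `value_fn`, and all function names are assumptions made for this example, not details taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Similarity between two states (tuples of floats). Assumed kernel choice."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-dist_sq / (2.0 * bandwidth ** 2))


def smoothed_backups(samples, value_fn, gamma, bandwidth):
    """Return a locally smoothed one-step backup for every sample.

    samples  : list of (state, reward, next_state) tuples (hypothetical layout)
    value_fn : callable giving the current value estimate of a state
    gamma    : discount factor
    """
    # Raw (noisy) backup of each sample: r + gamma * V(s').
    raw = [r + gamma * value_fn(s_next) for (_, r, s_next) in samples]

    smoothed = []
    for state_i, _, _ in samples:
        # Weight every sample by how close its state is to state_i.
        weights = [gaussian_kernel(state_i, state_j, bandwidth)
                   for state_j, _, _ in samples]
        total = sum(weights)  # at least 1.0, since each state matches itself
        smoothed.append(sum(w * b for w, b in zip(weights, raw)) / total)
    return smoothed


if __name__ == "__main__":
    # Two noisy samples taken at the same state: rewards 0 and 2.
    samples = [((0.0,), 0.0, (1.0,)), ((0.0,), 2.0, (1.0,))]
    print(smoothed_backups(samples, lambda s: 0.0, gamma=0.9, bandwidth=1.0))
```

In this sketch the two samples share a state, so each smoothed backup is the plain average of the two noisy ones. In an LP such as RALP the value function is itself the decision variable, so the same averaging would presumably be applied to the constraint coefficients rather than to a fixed `value_fn`; that step is omitted here.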

Bibliographic details

Source
Conference on Uncertainty in Artificial Intelligence, 2012, pp. 835-842 (8 pages)
Authors
Gavin Taylor; Ronald Parr
Original format: PDF
