Least Absolute Policy Iteration-A Robust Approach to Value Function Approximation

Masashi SUGIYAMA; Hirotaka HACHIYA; Hisashi KASHIMA; Tetsuro MORIMURA

Abstract

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers in observed rewards. In this paper, we propose an alternative method that employs the absolute loss for enhancing robustness and reliability. The proposed method is formulated as a linear programming problem which can be solved efficiently by standard optimization software, so the computational advantage is not sacrificed for gaining robustness and reliability. We demonstrate the usefulness of the proposed approach through a simulated robot-control task.
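
The outlier sensitivity mentioned in the abstract can be illustrated with a toy example. This is only a sketch and not the paper's algorithm: it assumes a constant model fitted to a handful of made-up reward samples, for which the squared-loss minimizer is the mean and the absolute-loss minimizer is the median. The sample values and all names (`clean`, `noisy`, `squared_loss`, `absolute_loss`) are hypothetical.

```python
from statistics import mean, median


def squared_loss(c, rewards):
    """Sum of squared residuals of the constant fit c."""
    return sum((r - c) ** 2 for r in rewards)


def absolute_loss(c, rewards):
    """Sum of absolute residuals of the constant fit c."""
    return sum(abs(r - c) for r in rewards)


# Hypothetical reward samples; values are made up for illustration.
clean = [1.0, 1.1, 0.9, 1.0, 1.05]
# Same samples with the last one replaced by a single outlier.
noisy = clean[:-1] + [50.0]

# For a constant model, the mean minimizes the squared loss
# and the median minimizes the absolute loss.
ls_clean, ls_noisy = mean(clean), mean(noisy)
la_clean, la_noisy = median(clean), median(noisy)

# How far one outlier moves each estimate.
ls_shift = abs(ls_noisy - ls_clean)
la_shift = abs(la_noisy - la_clean)

print("least-squares estimate shift:", ls_shift)
print("least-absolute estimate shift:", la_shift)
```

In this toy setting a single corrupted reward drags the least-squares estimate far from the bulk of the data, while the least-absolute estimate stays with the majority of the samples. The paper applies the absolute loss to value function approximation and solves the resulting problem as a linear program, which this sketch does not attempt.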

Bibliographic details

Source
IEICE Transactions on Information and Systems, 2010, No. 9, pp. 2555-2565 (11 pages)
Authors
Masashi SUGIYAMA; Hirotaka HACHIYA; Hisashi KASHIMA; Tetsuro MORIMURA
Author affiliations

Department of Computer Science, Tokyo Institute of Technology, Tokyo, 152-8552 Japan; PRESTO, Japan Science and Technology Agency, Tokyo, 152-8552 Japan

Department of Computer Science, Tokyo Institute of Technology, Tokyo, 152-8552 Japan

Department of Mathematical Informatics, the University of Tokyo, Tokyo, 113-8656 Japan

IBM Research - Tokyo, Yamato-shi, 242-8502 Japan

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
reinforcement learning; value function approximation; least-squares policy iteration; outlier; ℓ1-loss function; linear programming

Added to database: 2022-08-18 00:27:00

Similar literature

1. Least Absolute Policy Iteration-A Robust Approach to Value Function Approximation [J]. Masashi SUGIYAMA, Hirotaka HACHIYA, Hisashi KASHIMA. IEICE Transactions on Information and Systems. 2010, No. 9

2. Approach to Non-Absolute Integration by Successive Approximations [J]. Shizu Nakanishi. Scientiae Mathematicae Japonicae. 2001, No. 2

3. A Class of Incomplete Riemann Solvers Based on Uniform Rational Approximations to the Absolute Value Function [J]. Manuel J. Castro, Jose M. Gallardo, Antonio Marquina. Journal of Scientific Computing. 2014, No. 2

4. Least absolute policy iteration for robust value function approximation [C]. Sugiyama, Masashi, Hachiya, Hirotaka, Kashima, Hisashi. IEEE International Conference on Robotics and Automation (ICRA '09). 2009

5. Analyzing reproducing kernel approximation methods via a Green function approach [D]. Ye, Qi. 2012

6. Function approximation approach to the inference of reduced NGnet models of genetic networks [O]. Shuhei Kimura, Katsuki Sonoda, Soichiro Yamane. 2008

7. Least absolute policy iteration for robust value function approximation [O]. Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashima. 2009

8. Rational Approximations of Transfer Functions of Some Viscoelastic Rods, with Applications to Robust Control [R]. Hannsgen, K. B., Staffans, O. J., Wheeler, R. L. 1992
