Chinese Control and Decision Conference

Least-Squares Temporal Difference Learning with Eligibility Traces based on Regularized Extreme Learning Machine

Abstract

The task of learning the value function under a fixed policy in continuous Markov decision processes (MDPs) is considered. Although the extreme learning machine (ELM) trains quickly and avoids the parameter-tuning issues of traditional artificial neural networks (ANNs), the randomness of its hidden-layer parameters leads to fluctuating performance. In this paper, a least-squares temporal difference algorithm with eligibility traces based on the regularized extreme learning machine (RELM-LSTD(λ)) is proposed to overcome these ELM-induced problems in reinforcement learning. The proposed algorithm combines the LSTD(λ) algorithm with an RELM, which is used to approximate the value function; an eligibility-trace term is further introduced to improve data efficiency. In experiments, the performance of the proposed algorithm is demonstrated and compared with that of LSTD and ELM-LSTD. The results show that the proposed algorithm achieves more stable and better performance in approximating the value function under a fixed policy.
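As a rough illustration of the idea summarized in the abstract, the sketch below combines LSTD(λ) with a random sigmoid hidden layer (standing in for the ELM feature map) and a ridge (Tikhonov) term playing the role of the RELM regularization. All function names, dimensions, and hyperparameters here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random-feature map standing in for the RELM hidden layer:
# input weights and biases are drawn once at random and kept fixed, as in ELM.
n_hidden, state_dim = 50, 1
W_in = rng.normal(size=(n_hidden, state_dim))
b_in = rng.normal(size=n_hidden)

def features(s):
    """Sigmoid hidden-layer outputs of the random ELM for state s."""
    return 1.0 / (1.0 + np.exp(-(W_in @ np.atleast_1d(s) + b_in)))

def relm_lstd_lambda(trajectory, gamma=0.95, lam=0.8, reg=1e-3):
    """LSTD(lambda) over ELM features with an L2 regularization term.

    trajectory: iterable of (s, r, s_next, done) tuples collected under
    a fixed policy. Returns output weights w with V(s) ~= w @ features(s).
    """
    A = np.zeros((n_hidden, n_hidden))
    b = np.zeros(n_hidden)
    z = np.zeros(n_hidden)                      # eligibility trace
    for s, r, s_next, done in trajectory:
        phi, phi_next = features(s), features(s_next)
        z = gamma * lam * z + phi               # accumulate the trace
        # Terminal next states contribute no bootstrapped value.
        A += np.outer(z, phi - (0.0 if done else gamma) * phi_next)
        b += r * z
        if done:
            z = np.zeros(n_hidden)              # reset trace between episodes
    # The ridge term keeps A + reg*I invertible despite the random,
    # possibly ill-conditioned features -- the motivation for RELM over ELM.
    return np.linalg.solve(A + reg * np.eye(n_hidden), b)
```

Because the hidden layer is random and fixed, the only learned quantities are the linear output weights, which the regularized least-squares solve recovers in closed form; this is what gives the ELM family its fast training relative to backpropagation-trained ANNs.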
