IEEE Transactions on Neural Networks and Learning Systems

Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning



Abstract

Due to the lack of sufficient generalization in the state space, common reinforcement learning methods suffer from slow learning, especially in the early learning trials. This paper introduces a model-based method for discrete state spaces that increases the learning speed in terms of required experiences (though not required computation time) by exploiting generalization across the experiences of subspaces. A subspace is formed by choosing a subset of features in the original state representation. Generalization and faster learning in a subspace are due to the many-to-one mapping of experiences from the state space to each state in the subspace. Nevertheless, due to the inherent perceptual aliasing (PA) in the subspaces, the policy suggested by each subspace does not generally converge to the optimal policy. Our approach, called model-based learning with subspaces (MoBLeS), calculates the confidence intervals of the estimated Q-values in the state space and in the subspaces. These confidence intervals are used in decision-making, such that the agent benefits the most from the possible generalization while avoiding the detriment of the PA in the subspaces. The convergence of MoBLeS to the optimal policy is investigated theoretically. In addition, we show through several experiments that MoBLeS improves the learning speed in the early trials.
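The abstract only describes the idea of combining confidence intervals over Q-value estimates from the full state space and from feature subspaces; it does not give the exact decision rule. The Python sketch below is therefore purely illustrative: the `IntervalEstimate` class, the `select_action` function, the overlap-and-tighten rule, and the optimistic action choice are assumptions made for illustration, not the algorithm as defined in the paper.

```python
import math
from collections import defaultdict

# Illustrative sketch only: MoBLeS is described as using confidence intervals
# of Q-value estimates in the state space and in the subspaces during
# decision-making. The interval bookkeeping and the combination rule below
# are assumptions for illustration.

class IntervalEstimate:
    """Running Q-value estimate with a simple confidence interval."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, target):
        self.total += target
        self.count += 1

    def interval(self, z=1.96, spread=1.0):
        if self.count == 0:
            return -math.inf, math.inf          # no data yet: maximal uncertainty
        mean = self.total / self.count
        half = z * spread / math.sqrt(self.count)
        return mean - half, mean + half


def select_action(state, actions, full_q, subspace_qs, projections):
    """Choose an action optimistically, tightening each full-state interval
    with any subspace interval that overlaps it, so that an aliased subspace
    estimate cannot override the full-state evidence."""
    best_action, best_upper = actions[0], -math.inf
    for a in actions:
        lo, hi = full_q[(state, a)].interval()
        for project, q_sub in zip(projections, subspace_qs):
            s_sub = project(state)                       # many-to-one mapping
            sub_lo, sub_hi = q_sub[(s_sub, a)].interval()
            overlaps = sub_lo <= hi and lo <= sub_hi
            tighter = (sub_hi - sub_lo) < (hi - lo)
            if overlaps and tighter:
                lo, hi = max(lo, sub_lo), min(hi, sub_hi)
        if hi > best_upper:
            best_action, best_upper = a, hi
    return best_action


# Example wiring: tables keyed by (state, action) / (subspace state, action),
# and hypothetical projections that each keep a subset of the state features.
full_q = defaultdict(IntervalEstimate)
subspace_qs = [defaultdict(IntervalEstimate), defaultdict(IntervalEstimate)]
projections = [lambda s: s[:1], lambda s: s[1:]]
action = select_action((0, 3), actions=[0, 1, 2],
                       full_q=full_q, subspace_qs=subspace_qs,
                       projections=projections)
```

The intent the sketch tries to capture is that a subspace estimate is only trusted when it is both tighter than and consistent with the full-state estimate, so the agent gains from generalization early on without being misled by perceptual aliasing; the paper itself specifies the actual rule and its convergence analysis.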
