IEEE Transactions on Neural Networks and Learning Systems

Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning



Abstract

Due to insufficient generalization in the state space, common reinforcement learning methods suffer from slow learning, especially in the early learning trials. This paper introduces a model-based method for discrete state spaces that increases learning speed in terms of required experiences (but not required computation time) by exploiting generalization in the experiences of subspaces. A subspace is formed by choosing a subset of features from the original state representation. Generalization, and hence faster learning, in a subspace is due to the many-to-one mapping of experiences from the state space to each state in the subspace. Nevertheless, due to inherent perceptual aliasing (PA) in the subspaces, the policy suggested by each subspace does not generally converge to the optimal policy. Our approach, called model-based learning with subspaces (MoBLeS), calculates confidence intervals of the estimated Q-values in the state space and in the subspaces. These confidence intervals are used in decision-making so that the agent benefits the most from the possible generalization while avoiding the detriment of PA in the subspaces. The convergence of MoBLeS to the optimal policy is theoretically investigated. In addition, we show through several experiments that MoBLeS improves the learning speed in the early trials.
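The abstract describes two ingredients that can be sketched in code: a many-to-one projection from the full state to a subspace (keeping a subset of features), and confidence intervals on Q-estimates that shrink with visit counts, used to decide when the subspace's generalized estimate can be trusted over the full-state one. The sketch below is only an illustration of these ideas under stated assumptions (Hoeffding-style interval width `c / sqrt(n)`, running-mean Q updates, and a "prefer the tighter interval" rule); it is not the paper's actual MoBLeS update or decision rule, whose details the abstract does not give.

```python
import math
from collections import defaultdict

# Hypothetical choice: keep only feature 0 of a multi-feature state.
SUBSPACE_FEATURES = (0,)

def project(state, features=SUBSPACE_FEATURES):
    """Many-to-one mapping: drop features to land in the subspace."""
    return tuple(state[i] for i in features)

class QEstimator:
    """Running-mean Q-values with visit counts, for either the full
    state space or a subspace (keys differ, mechanics are the same)."""

    def __init__(self, c=1.0):
        self.q = defaultdict(float)  # running-mean Q estimate per (key, action)
        self.n = defaultdict(int)    # visit counts per (key, action)
        self.c = c                   # interval scale (assumption)

    def update(self, key, action, target):
        k = (key, action)
        self.n[k] += 1
        self.q[k] += (target - self.q[k]) / self.n[k]

    def interval(self, key, action):
        k = (key, action)
        if self.n[k] == 0:
            return (-math.inf, math.inf)  # no data: vacuous interval
        w = self.c / math.sqrt(self.n[k])  # Hoeffding-style width (assumption)
        return (self.q[k] - w, self.q[k] + w)

def pick_value(full, sub, state, action):
    """Use the subspace estimate only when its interval is tighter,
    i.e. its generalization helps and aliasing has not widened it."""
    lo_f, hi_f = full.interval(state, action)
    lo_s, hi_s = sub.interval(project(state), action)
    if (hi_s - lo_s) < (hi_f - lo_f):
        return (lo_s + hi_s) / 2
    return (lo_f + hi_f) / 2
```

Because every full-space state that shares the kept features maps to the same subspace state, the subspace accumulates experience faster and its interval tightens sooner, which is the source of the early-trial speed-up the abstract claims; once the full-state interval becomes tighter, the rule falls back to the unaliased estimate.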


