IEEE Transactions on Neural Networks and Learning Systems

Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning



Abstract

Due to the lack of sufficient generalization in the state space, common reinforcement learning methods suffer from slow learning, especially in the early learning trials. This paper introduces a model-based method for discrete state spaces that increases the learning speed in terms of required experiences (though not required computation time) by exploiting generalization across the experiences of subspaces. A subspace is formed by choosing a subset of features in the original state representation. Generalization and faster learning in a subspace are due to the many-to-one mapping of experiences from the state space to each state in the subspace. Nevertheless, due to the inherent perceptual aliasing (PA) in the subspaces, the policy suggested by each subspace does not generally converge to the optimal policy. Our approach, called model-based learning with subspaces (MoBLeS), calculates the confidence intervals of the estimated Q-values in the state space and in the subspaces. These confidence intervals are used in decision-making, such that the agent benefits the most from the possible generalization while avoiding the detriment of the PA in the subspaces. The convergence of MoBLeS to the optimal policy is investigated theoretically. In addition, we show through several experiments that MoBLeS improves the learning speed in the early trials.
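The abstract only describes the idea of combining confidence intervals over Q-value estimates from the full state space and from feature subspaces; it does not give the exact decision rule. The Python sketch below is therefore purely illustrative: the `IntervalEstimate` class, the `select_action` function, the overlap-and-tighten rule, and the optimistic action choice are assumptions made for illustration, not the algorithm as defined in the paper.

```python
import math
from collections import defaultdict

# Illustrative sketch only: MoBLeS is described as using confidence intervals
# of Q-value estimates in the state space and in the subspaces during
# decision-making. The interval bookkeeping and the combination rule below
# are assumptions for illustration.

class IntervalEstimate:
    """Running Q-value estimate with a simple confidence interval."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, target):
        self.total += target
        self.count += 1

    def interval(self, z=1.96, spread=1.0):
        if self.count == 0:
            return -math.inf, math.inf          # no data yet: maximal uncertainty
        mean = self.total / self.count
        half = z * spread / math.sqrt(self.count)
        return mean - half, mean + half


def select_action(state, actions, full_q, subspace_qs, projections):
    """Choose an action optimistically, tightening each full-state interval
    with any subspace interval that overlaps it, so that an aliased subspace
    estimate cannot override the full-state evidence."""
    best_action, best_upper = actions[0], -math.inf
    for a in actions:
        lo, hi = full_q[(state, a)].interval()
        for project, q_sub in zip(projections, subspace_qs):
            s_sub = project(state)                       # many-to-one mapping
            sub_lo, sub_hi = q_sub[(s_sub, a)].interval()
            overlaps = sub_lo <= hi and lo <= sub_hi
            tighter = (sub_hi - sub_lo) < (hi - lo)
            if overlaps and tighter:
                lo, hi = max(lo, sub_lo), min(hi, sub_hi)
        if hi > best_upper:
            best_action, best_upper = a, hi
    return best_action


# Example wiring: tables keyed by (state, action) / (subspace state, action),
# and hypothetical projections that each keep a subset of the state features.
full_q = defaultdict(IntervalEstimate)
subspace_qs = [defaultdict(IntervalEstimate), defaultdict(IntervalEstimate)]
projections = [lambda s: s[:1], lambda s: s[1:]]
action = select_action((0, 3), actions=[0, 1, 2],
                       full_q=full_q, subspace_qs=subspace_qs,
                       projections=projections)
```

The intent the sketch tries to capture is that a subspace estimate is only trusted when it is both tighter than and consistent with the full-state estimate, so the agent gains from generalization early on without being misled by perceptual aliasing; the paper itself specifies the actual rule and its convergence analysis.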
