
Trading off perception with internal state: reinforcement learning and analysis of Q-Elman networks in a Markovian task


Abstract

A Markovian reinforcement learning task can be dealt with by learning a direct mapping from states to actions or values, or from state-action pairs to values. However, this may involve a difficult pattern recognition problem when the state space is large. This paper shows that using internal state, called "supportive state", may alleviate this problem, presenting an argument against the tendency to use a direct mapping almost automatically whenever the task is Markovian. This point is demonstrated in simulation experiments with an agent controlled by a neural network that can learn both the direct-mapping strategy and the internal-state strategy, combining Q(λ) learning and recurrent neural networks in a new way. The trade-off between the two strategies is investigated in more detail, focusing particularly on border cases.
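The abstract does not include the authors' implementation. As a rough illustration of the idea it describes, the sketch below combines an Elman recurrent network (whose hidden layer acts as the learned internal state) with a simple, naive variant of Q(λ) learning on a hypothetical two-state Markovian toy task. All sizes, hyper-parameters, the toy environment, and the one-step truncated gradient are assumptions for illustration only, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyper-parameters (assumptions, not from the paper).
n_obs, n_hidden, n_actions = 2, 8, 2
gamma, lam, alpha, epsilon = 0.95, 0.8, 0.05, 0.1

# Elman network weights: observation -> hidden, context (previous hidden) -> hidden, hidden -> Q.
W_in  = rng.normal(0.0, 0.1, (n_hidden, n_obs))
W_ctx = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
W_out = rng.normal(0.0, 0.1, (n_actions, n_hidden))

def forward(obs, context):
    """One Elman step: Q-values for every action plus the new hidden (internal) state."""
    h = np.tanh(W_in @ obs + W_ctx @ context)
    return W_out @ h, h

def grads(obs, context, h, action):
    """Gradient of Q(obs, action) w.r.t. each weight matrix, truncated to one step:
    the context is treated as a fixed input (truncated backprop through time, depth 1)."""
    dpre = W_out[action] * (1.0 - h ** 2)       # dQ / d(hidden pre-activation)
    g_out = np.zeros_like(W_out)
    g_out[action] = h
    return np.outer(dpre, obs), np.outer(dpre, context), g_out

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def toy_env(state, action):
    """Hypothetical fully observable (Markovian) two-state task, purely for illustration:
    reward 1 when the action matches the state, then a random next state."""
    reward = 1.0 if action == state else 0.0
    return int(rng.integers(2)), reward, rng.random() < 0.05

for episode in range(300):
    state = int(rng.integers(2))
    context = np.zeros(n_hidden)                # the learned internal ("supportive") state
    traces = [np.zeros_like(W_in), np.zeros_like(W_ctx), np.zeros_like(W_out)]
    done = False
    while not done:
        obs = one_hot(state, n_obs)
        q, h = forward(obs, context)
        greedy = int(np.argmax(q))
        action = int(rng.integers(n_actions)) if rng.random() < epsilon else greedy

        next_state, reward, done = toy_env(state, action)
        q_next, _ = forward(one_hot(next_state, n_obs), h)
        target = reward if done else reward + gamma * np.max(q_next)
        delta = target - q[action]

        # Naive Q(lambda): decay the eligibility traces, add the current gradient, take a TD step.
        for trace, grad, weight in zip(traces, grads(obs, context, h, action), (W_in, W_ctx, W_out)):
            trace *= gamma * lam
            trace += grad
            weight += alpha * delta * trace

        state, context = next_state, h          # carry the internal state to the next step
```

The line of interest is the last one in the loop: the hidden state is carried forward across time steps, so the network can come to rely on this internal state in place of, or in addition to, the direct observation-to-value mapping, which is the trade-off the paper analyses.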
