
Reinforcement learning and episodic memory in humans and animals: an integrative framework



Abstract

We review the psychology and neuroscience of reinforcement learning (RL), which has witnessed significant progress in the last two decades, enabled by the comprehensive experimental study of simple learning and decision-making tasks. However, the simplicity of these tasks misses important aspects of reinforcement learning in the real world: (i) state spaces are high-dimensional, continuous, and partially observable; this implies that (ii) data are relatively sparse: indeed, precisely the same situation may never be encountered twice; and also that (iii) rewards depend on long-term consequences of actions in ways that violate the classical assumptions that make RL tractable. A seemingly distinct challenge is that, cognitively, these theories have largely connected with procedural and semantic memory: how knowledge about action values or world models, extracted gradually from many experiences, can drive choice. This misses many aspects of memory related to traces of individual events, such as episodic memory. We suggest that these two gaps are related. In particular, the computational challenges can be dealt with, in part, by endowing RL systems with episodic memory, allowing them to (i) efficiently approximate value functions over complex state spaces, (ii) learn with very little data, and (iii) bridge long-term dependencies between actions and rewards. We review the computational theory underlying this proposal and the empirical evidence to support it. Our proposal suggests that the ubiquitous and diverse roles of memory in RL may function as part of an integrated learning system.
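The proposal sketched in the abstract, estimating values by generalizing from stored traces of individual episodes rather than from incrementally learned summary statistics, can be made concrete. Below is a minimal Python sketch of a kernel-based episodic value estimate in the spirit of episodic control; the class name, the Gaussian kernel, and the bandwidth parameter are illustrative assumptions rather than details from the paper.

```python
import numpy as np

class EpisodicValueEstimator:
    """Nonparametric Q-values from stored episodic traces.

    Each trace is a (state, action, return) tuple; the value of a new
    state-action pair is a kernel-weighted average of the returns that
    followed similar remembered states.
    """

    def __init__(self, bandwidth=0.2):
        self.bandwidth = bandwidth  # kernel width: how far traces generalize
        self.traces = []            # list of (state vector, action, return G)

    def store(self, state, action, G):
        # Record a single experienced episode: the (possibly delayed)
        # return G that followed taking `action` in `state`.
        self.traces.append((np.asarray(state, dtype=float), action, float(G)))

    def q_value(self, state, action):
        # Gaussian-kernel regression over stored returns for this action.
        state = np.asarray(state, dtype=float)
        weights, returns = [], []
        for s, a, G in self.traces:
            if a != action:
                continue
            d2 = np.sum((state - s) ** 2)
            weights.append(np.exp(-d2 / (2.0 * self.bandwidth ** 2)))
            returns.append(G)
        if not weights:
            return 0.0  # no relevant memories yet
        w = np.asarray(weights)
        return float(np.dot(w, returns) / w.sum())

em = EpisodicValueEstimator(bandwidth=0.2)
em.store([0.1, 0.9], action=0, G=1.0)      # one remembered episode
em.store([0.8, 0.2], action=0, G=-1.0)     # another, in a distant state
print(em.q_value([0.15, 0.85], action=0))  # ~1.0: the nearest trace dominates
```

Each of the abstract's three points has a counterpart in this sketch: the kernel generalizes value estimates across a continuous state space (i), a single stored trace influences decisions immediately, with no incremental averaging needed (ii), and because each trace stores the full return G, an action is credited with its long-run consequences in one step (iii).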

Bibliographic details

  • Journal: Annual Review of Psychology
  • Year (volume): 2017 (68)
  • Pages: 101–128
  • Total pages: 32
  • Language: English
  • Format: PDF
