Neural Networks: The Official Journal of the International Neural Network Society

Intrinsically motivated action-outcome learning and goal-based action recall: A system-level bio-constrained computational model


Abstract

Reinforcement (trial-and-error) learning in animals is driven by a multitude of processes. Most animals have evolved several sophisticated systems of 'extrinsic motivations' (EMs) that guide them to acquire behaviours allowing them to maintain their bodies, defend against threat, and reproduce. Animals have also evolved various systems of 'intrinsic motivations' (IMs) that allow them to acquire actions in the absence of extrinsic rewards. These actions are used later to pursue such rewards when they become available. Intrinsic motivations have been studied in Psychology for many decades and their biological substrates are now being elucidated by neuroscientists. In the last two decades, investigators in computational modelling, robotics and machine learning have proposed various mechanisms that capture certain aspects of IMs. However, we still lack models of IMs that attempt to integrate all key aspects of intrinsically motivated learning and behaviour while taking into account the relevant neurobiological constraints. This paper proposes a bio-constrained system-level model that contributes a major step towards this integration. The model focusses on three processes related to IMs and on the neural mechanisms underlying them: (a) the acquisition of action-outcome associations (internal models of the agent-environment interaction) driven by phasic dopamine signals caused by sudden, unexpected changes in the environment; (b) the transient focussing of visual gaze and actions on salient portions of the environment; (c) the subsequent recall of actions to pursue extrinsic rewards based on goal-directed reactivation of the representations of their outcomes. The tests of the model, including a series of selective lesions, show how the focussing processes lead to a faster learning of action-outcome associations, and how these associations can be recruited for accomplishing goal-directed behaviours. 
The model, together with the background knowledge reviewed in the paper, represents a framework that can be used to guide the design and interpretation of empirical experiments on IMs, and to computationally validate and further develop theories on them.
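The three mechanisms described in the abstract can be illustrated with a minimal tabular sketch: a surprise-driven (phasic dopamine-like) signal gates the learning of action-outcome associations during an intrinsically motivated phase, and a goal-based recall step later reactivates an outcome representation to retrieve the associated action. All class, method, and variable names below are illustrative assumptions; the paper's actual model is a system-level neural architecture, not a lookup table.

```python
class ActionOutcomeModel:
    """Minimal sketch of intrinsically motivated action-outcome learning
    and goal-based action recall. Tabular stand-in for the paper's
    bio-constrained neural model (names are hypothetical)."""

    def __init__(self):
        self.assoc = {}          # outcome -> {action: association strength}
        self.predicted = set()   # outcomes the agent already expects

    def phasic_signal(self, outcome):
        # Surprise signal: a phasic dopamine-like burst fires only for
        # unexpected outcomes and habituates once the outcome is predicted.
        return 0.0 if outcome in self.predicted else 1.0

    def learn(self, action, outcome, lr=0.5):
        # Action-outcome learning, gated by the phasic surprise signal.
        da = self.phasic_signal(outcome)
        if da > 0:
            strengths = self.assoc.setdefault(outcome, {})
            old = strengths.get(action, 0.0)
            strengths[action] = old + lr * da * (1.0 - old)
            if strengths[action] > 0.9:
                # Outcome is now reliably predicted; surprise extinguishes.
                self.predicted.add(outcome)

    def recall(self, goal):
        # Goal-based recall: reactivating the desired outcome's
        # representation retrieves the most strongly associated action.
        strengths = self.assoc.get(goal)
        if not strengths:
            return None
        return max(strengths, key=strengths.get)


# Intrinsically motivated exploration phase (no extrinsic reward):
# the agent repeatedly experiences that an action causes an outcome.
m = ActionOutcomeModel()
for _ in range(10):
    m.learn("press_lever", "light_on")

# Later, when the outcome becomes extrinsically rewarding, the stored
# association is recruited for goal-directed behaviour.
print(m.recall("light_on"))  # press_lever
```

Note how learning is self-limiting: once the outcome becomes predicted, the phasic signal (and hence further learning) vanishes, mirroring the transient nature of the focusing processes described above.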