Source: U.S. National Institutes of Health literature, PLoS Clinical Trials

Dual Reward Prediction Components Yield Pavlovian Sign- and Goal-Tracking



Abstract

Reinforcement learning (RL) has become a dominant paradigm for understanding animal behavior and the neural correlates of decision-making, in part because it can explain Pavlovian conditioned behaviors and the role of midbrain dopamine activity as a reward prediction error (RPE). However, recent experimental findings indicate that dopamine activity, contrary to the RL hypothesis, may not signal RPE and differs with the type of Pavlovian response (e.g., sign- and goal-tracking responses). In this study, we address this discrepancy by introducing a new neural correlate for learning reward predictions, called “cue-evoked reward”: a recall of reward evoked by the cue, learned through simple cue-reward associations. We introduce a temporal difference learning model in which neural correlates of the cue itself and of the cue-evoked reward underlie the learning of reward predictions. The animal's reward prediction, supported by these two correlates, is divided into a sign component and a goal component, respectively. We relate the sign and goal components to approach responses towards the cue (i.e., sign-tracking) and the food tray (i.e., goal-tracking), respectively. We found a number of correspondences between the simulated models and the experimental findings (i.e., behavior and neural responses). First, the development of the modeled responses is consistent with that observed in the experimental task. Second, the model's RPEs were similar to dopamine activity in the respective response groups. Finally, goal-tracking, but not sign-tracking, responses rapidly emerged when RPE was restored in the simulated models, similar to experiments on recovery from a dopamine antagonist. These results suggest that two complementary neural correlates, corresponding to the cue and its evoked reward, form the basis for learning reward predictions in sign- and goal-tracking rats.
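
The abstract does not give the model's equations, so the following is only a minimal sketch of how a temporal difference prediction might be split into a sign component and a goal component. It is not the authors' model. Everything in it is an assumption made for illustration: the one-step cue-then-reward trial, the two unit-valued features (one for the cue, one for the cue-evoked reward), the separate learning rates `alpha_sign` and `alpha_goal`, and the function name `simulate_dual_td`.

```python
def simulate_dual_td(alpha_sign, alpha_goal, n_trials=200, gamma=1.0, reward=1.0):
    """Linear TD(0) on a hypothetical one-step cue -> reward trial.

    Assumption (not from the paper): during the cue, two features are
    active with value 1, the cue itself and the cue-evoked reward, so the
    prediction is V(cue) = w_sign + w_goal.
    """
    w_sign, w_goal = 0.0, 0.0          # sign and goal components of the prediction
    cue_rpes, reward_rpes = [], []

    for _ in range(n_trials):
        v_cue = w_sign + w_goal

        # RPE at cue onset: the pre-cue state has value 0 and no active
        # feature, so this error updates no weight in this sketch.
        cue_rpe = gamma * v_cue - 0.0

        # RPE at reward delivery: the trial ends, so the next value is 0.
        reward_rpe = reward + gamma * 0.0 - v_cue

        # Each component learns from the same RPE at its own (assumed) rate.
        w_sign += alpha_sign * reward_rpe
        w_goal += alpha_goal * reward_rpe

        cue_rpes.append(cue_rpe)
        reward_rpes.append(reward_rpe)

    return w_sign, w_goal, cue_rpes, reward_rpes


if __name__ == "__main__":
    # Hypothetical "sign-tracker-like" setting: the cue feature learns faster.
    w_sign, w_goal, cue_rpes, reward_rpes = simulate_dual_td(0.2, 0.05)
    print("sign component:", round(w_sign, 3))
    print("goal component:", round(w_goal, 3))
    print("RPE at reward, first vs last trial:", reward_rpes[0], round(reward_rpes[-1], 6))
```

In this toy setting the two components sum to the reward magnitude once learning converges, and the share each one holds is set by the ratio of the two learning rates. Swapping the rates gives a "goal-tracker-like" split. How the published model actually assigns and updates the two components should be taken from the paper itself.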
