Source: The Journal of Neuroscience (U.S. National Institutes of Health literature)
Human Dorsal Striatal Activity during Choice Discriminates Reinforcement Learning Behavior from the Gambler's Fallacy

Abstract

Reinforcement learning theory has generated substantial interest in neurobiology, particularly because of the resemblance between phasic dopamine and reward prediction errors. Actor–critic theories have been adapted to account for the functions of the striatum, with parts of the dorsal striatum equated to the actor. Here, we specifically test whether the human dorsal striatum—as predicted by an actor–critic instantiation—is used on a trial-to-trial basis at the time of choice to choose in accordance with reinforcement learning theory, as opposed to a competing strategy: the gambler's fallacy. Using a partial-brain functional magnetic resonance imaging scanning protocol focused on the striatum and other ventral brain areas, we found that the dorsal striatum is more active when choosing consistent with reinforcement learning compared with the competing strategy. Moreover, an overlapping area of dorsal striatum along with the ventral striatum was found to be correlated with reward prediction errors at the time of outcome, as predicted by the actor–critic framework. These findings suggest that the same region of dorsal striatum involved in learning stimulus–response associations may contribute to the control of behavior during choice, thereby using those learned associations. Intriguingly, neither reinforcement learning nor the gambler's fallacy conformed to the optimal choice strategy on the specific decision-making task we used. Thus, the dorsal striatum may contribute to the control of behavior according to reinforcement learning even when the prescriptions of such an algorithm are suboptimal in terms of maximizing future rewards.
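To make the actor-critic terminology above concrete, here is a minimal Python sketch of a generic actor-critic update on a two-option choice task. It is not the authors' model. The function names, the learning rates, the softmax choice rule, and especially the win-stay/lose-shift rule used to label a choice as consistent with reinforcement learning or with the gambler's fallacy are illustrative assumptions; the abstract does not describe the actual task or classification procedure.

```python
import math
import random


def softmax_choice(prefs, rng):
    """Actor at choice time: sample an action with probability
    proportional to exp(preference). (Assumed choice rule.)"""
    top = max(prefs)
    weights = [math.exp(p - top) for p in prefs]  # shift for numerical stability
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(prefs) - 1


def actor_critic_step(value, prefs, action, reward,
                      alpha_critic=0.1, alpha_actor=0.1):
    """One outcome-time update. The critic computes the reward prediction
    error (delta); the same delta trains the critic's value estimate and
    the actor's preference for the chosen action.
    Learning rates are hypothetical defaults."""
    delta = reward - value                  # reward prediction error
    new_value = value + alpha_critic * delta
    new_prefs = list(prefs)
    new_prefs[action] += alpha_actor * delta
    return delta, new_value, new_prefs


def label_choice(prev_action, prev_reward, action):
    """Simplified, assumed labelling: repeating a rewarded action or
    switching after an unrewarded one counts as RL-consistent; the
    opposite pattern counts as gambler's-fallacy-consistent."""
    stayed = action == prev_action
    won = prev_reward > 0
    if stayed == won:
        return "rl_consistent"
    return "gamblers_fallacy_consistent"


if __name__ == "__main__":
    rng = random.Random(0)
    value, prefs = 0.0, [0.0, 0.0]
    action = softmax_choice(prefs, rng)
    delta, value, prefs = actor_critic_step(value, prefs, action, reward=1.0)
    next_action = softmax_choice(prefs, rng)
    print(delta, label_choice(action, 1.0, next_action))
```

In this sketch a positive prediction error raises the preference for the action just taken, so the actor becomes more likely to repeat it; a gambler's-fallacy strategy would instead expect a win to be less likely after a win and switch.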