PLoS Computational Biology

Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum



Abstract

Previous theoretical studies of animal and human behavioral learning have focused on the dichotomy between the value-based strategy, which uses action value functions to predict rewards, and the model-based strategy, which uses internal models to predict environmental states. However, animals and humans often adopt simple procedural behaviors, such as the "win-stay, lose-switch" strategy, without explicit prediction of rewards or states. Here we consider another strategy, the finite state-based strategy, in which a subject selects an action depending on its discrete internal state and updates the state depending on the action chosen and the reward outcome. By analyzing the choice behavior of rats in a free-choice task, we found that the finite state-based strategy fitted their behavioral choices more accurately than the value-based and model-based strategies did. When the fitted models were run autonomously on the same task, only the finite state-based strategy could reproduce the key feature of the choice sequences. Analyses of neural activity recorded from the dorsolateral striatum (DLS), the dorsomedial striatum (DMS), and the ventral striatum (VS) identified significant fractions of neurons in all three subareas whose activities were correlated with individual states of the finite state-based strategy. The signal of the internal state at the time of choice was found in the DMS, and the signal for clusters of states was found in the VS. In addition, action values and state values of the value-based strategy were encoded in the DMS and the VS, respectively. These results suggest that both the value-based strategy and the finite state-based strategy are implemented in the striatum.
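As a rough illustration of the distinction the abstract draws, and not the authors' actual models, the sketch below implements a "win-stay, lose-switch" finite state-based agent and a simple Q-learning value-based agent for a two-choice task. The reward probabilities, learning rate, inverse temperature, and all class and function names are assumptions introduced here for illustration only.

```python
import math
import random

# Hypothetical reward probabilities for a two-choice ("L"/"R") free-choice task.
REWARD_PROB = {"L": 0.7, "R": 0.3}


def get_reward(action):
    """Return 1 with the action's reward probability, otherwise 0."""
    return 1 if random.random() < REWARD_PROB[action] else 0


class FiniteStateStrategy:
    """Finite state-based strategy: a discrete internal state determines the
    action, and the state is updated from the chosen action and the reward
    outcome. "Win-stay, lose-switch" is the simplest instance: the state is
    just the action to repeat next."""

    def __init__(self):
        self.state = "L"

    def choose(self):
        return self.state

    def update(self, action, reward):
        if reward == 0:  # lose-switch; win-stay keeps the state unchanged
            self.state = "R" if action == "L" else "L"


class ValueBasedStrategy:
    """Value-based strategy: keep one action value per option, update it
    with the reward prediction error, and choose actions by softmax."""

    def __init__(self, alpha=0.1, beta=3.0):
        self.q = {"L": 0.0, "R": 0.0}
        self.alpha = alpha  # learning rate (assumed value)
        self.beta = beta    # inverse temperature (assumed value)

    def choose(self):
        weights = [math.exp(self.beta * self.q[a]) for a in ("L", "R")]
        return random.choices(["L", "R"], weights=weights)[0]

    def update(self, action, reward):
        self.q[action] += self.alpha * (reward - self.q[action])


def run(agent, n_trials=200):
    """Simulate one agent on the task and return its (action, reward) history."""
    history = []
    for _ in range(n_trials):
        a = agent.choose()
        r = get_reward(a)
        agent.update(a, r)
        history.append((a, r))
    return history


if __name__ == "__main__":
    for agent in (FiniteStateStrategy(), ValueBasedStrategy()):
        h = run(agent)
        print(type(agent).__name__, "reward rate:", sum(r for _, r in h) / len(h))
```

In the study, such models would be fitted to the rats' trial-by-trial choices (for example, by maximizing the likelihood of the observed choice sequence) and then compared; the simulation above only illustrates how each class of strategy generates behavior.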