Frontiers in Neuroscience

A Flexible Mechanism of Rule Selection Enables Rapid Feature-Based Reinforcement Learning

Abstract

Learning in a new environment is influenced by prior learning and experience. Correctly applying a rule that maps a context to stimuli, actions, and outcomes enables faster learning and better outcomes than relying on learning strategies that are ignorant of task structure. However, it is often difficult to know when and how to apply learned rules in new contexts. In our study we explored how subjects employ different strategies for learning the relationship between stimulus features and positive outcomes in a probabilistic task context. We test the hypothesis that task-naive subjects will show enhanced learning of feature-specific reward associations by switching to the use of an abstract rule that associates stimuli by feature type and restricts selections to that dimension. To test this hypothesis we designed a decision-making task in which subjects receive probabilistic feedback following choices between pairs of stimuli. In the task, trials are grouped by blocks into two contexts: in one type of block there is no unique relationship between a specific feature dimension (stimulus shape or color) and positive outcomes, while following an uncued transition, alternating blocks have outcomes that are linked to either stimulus shape or color. Two-thirds of subjects (n = 22/32) exhibited behavior that was best fit by a hierarchical feature-rule model. Supporting the prediction of the model mechanism, these subjects showed significantly enhanced performance in feature-reward blocks and rapidly switched their choice strategy to using abstract feature rules when reward contingencies changed. Choice behavior of the remaining subjects (n = 10/32) was fit by a range of alternative reinforcement learning models representing strategies that do not benefit from applying previously learned rules. In summary, these results show that untrained subjects are capable of flexibly shifting between behavioral rules by leveraging simple model-free reinforcement learning and context-specific selections to drive responses.
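To make the contrast between the two strategy classes concrete, below is a minimal Python sketch, not the authors' computational model: the class names, parameter values (ALPHA, BETA), reward probabilities, and the rule-credit update are all illustrative assumptions. It compares a flat, model-free learner that values every stimulus feature independently against a hierarchical feature-rule learner that additionally tracks which feature dimension (color or shape) currently predicts reward and restricts its choices to that dimension.

```python
import math
import random

ALPHA = 0.3   # value learning rate (illustrative assumption)
BETA = 5.0    # softmax inverse temperature (illustrative assumption)
COLORS = ["red", "green"]
SHAPES = ["circle", "square"]

def softmax_pick(values):
    """Sample an index with probability proportional to exp(BETA * value)."""
    weights = [math.exp(BETA * v) for v in values]
    r = random.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(values) - 1

class FlatLearner:
    """Model-free learner: one value per feature, blind to task structure."""
    def __init__(self):
        self.q = {f: 0.5 for f in COLORS + SHAPES}

    def choose(self, stim_a, stim_b):
        # Value a stimulus as the mean value of its two features.
        va = sum(self.q[f] for f in stim_a) / len(stim_a)
        vb = sum(self.q[f] for f in stim_b) / len(stim_b)
        return softmax_pick([va, vb])

    def update(self, stim, reward):
        for f in stim:
            self.q[f] += ALPHA * (reward - self.q[f])

class FeatureRuleLearner(FlatLearner):
    """Hierarchical learner: also tracks which dimension (color or shape)
    currently predicts reward and attends only to that dimension."""
    def __init__(self):
        super().__init__()
        self.rule = {"color": 0.5, "shape": 0.5}  # credibility of each rule

    def _dim_value(self, stim, dim):
        # A stimulus is a (color, shape) tuple.
        return self.q[stim[0] if dim == "color" else stim[1]]

    def choose(self, stim_a, stim_b):
        dim = max(self.rule, key=self.rule.get)  # follow the most credible rule
        return softmax_pick([self._dim_value(stim_a, dim),
                             self._dim_value(stim_b, dim)])

    def update(self, stim, reward):
        super().update(stim, reward)
        # Credit each rule by how well its dimension predicted the outcome,
        # which lets the agent switch rules rapidly when contingencies change.
        for dim in self.rule:
            accuracy = 1.0 - abs(reward - self._dim_value(stim, dim))
            self.rule[dim] += ALPHA * (accuracy - self.rule[dim])

def run_block(agent, rewarded_color="red", p_reward=0.8, n_trials=100):
    """One color-reward block: the two stimuli always differ in color, and
    choosing the stimulus carrying `rewarded_color` pays off with p_reward."""
    hits = 0
    for _ in range(n_trials):
        pair = [("red", random.choice(SHAPES)), ("green", random.choice(SHAPES))]
        random.shuffle(pair)
        choice = pair[agent.choose(pair[0], pair[1])]
        p = p_reward if rewarded_color in choice else 1.0 - p_reward
        reward = 1.0 if random.random() < p else 0.0
        agent.update(choice, reward)
        hits += rewarded_color in choice
    return hits / n_trials

random.seed(1)
print("flat model-free learner :", run_block(FlatLearner()))
print("feature-rule learner    :", run_block(FeatureRuleLearner()))
```

Under these stated assumptions, the flat learner dilutes the informative color value by averaging it with uninformative shape values, while the rule learner, once the color rule accrues credit, chooses on color alone; this mirrors the enhanced feature-block performance and rapid rule switching described in the abstract.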
