A Flexible Mechanism of Rule Selection Enables Rapid Feature-Based Reinforcement Learning

Matthew Balcarras; Thilo Womelsdorf

Abstract

Learning in a new environment is influenced by prior learning and experience. Correctly applying a rule that maps a context to stimuli, actions, and outcomes enables faster learning and better outcomes compared to relying on strategies for learning that are ignorant of task structure. However, it is often difficult to know when and how to apply learned rules in new contexts. In our study, we explored how subjects employ different strategies for learning the relationship between stimulus features and positive outcomes in a probabilistic task context. We test the hypothesis that task-naive subjects will show enhanced learning of feature-specific reward associations by switching to the use of an abstract rule that associates stimuli by feature type and restricts selections to that dimension. To test this hypothesis, we designed a decision-making task where subjects receive probabilistic feedback following choices between pairs of stimuli. In the task, trials are grouped in two contexts by blocks, where in one type of block there is no unique relationship between a specific feature dimension (stimulus shape or color) and positive outcomes, and following an un-cued transition, alternating blocks have outcomes that are linked to either stimulus shape or color. Two-thirds of subjects (n = 22/32) exhibited behavior that was best fit by a hierarchical feature-rule model. Supporting the prediction of the model mechanism, these subjects showed significantly enhanced performance in feature-reward blocks, and rapidly switched their choice strategy to using abstract feature rules when reward contingencies changed. Choice behavior of other subjects (n = 10/32) was fit by a range of alternative reinforcement learning models representing strategies that do not benefit from applying previously learned rules.
In summary, these results show that untrained subjects are capable of flexibly shifting between behavioral rules by leveraging simple model-free reinforcement learning and context-specific selections to drive responses.
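
The idea of restricting learning and choice to one feature dimension can be illustrated with a minimal sketch. This is not the authors' hierarchical feature-rule model, which the abstract does not specify. It assumes a plain delta-rule value update, a softmax choice rule, and a block in which one color is rewarded more often. The names `simulate_block`, `rule_dimension`, `p_high`, `alpha`, and `beta`, as well as all numeric values, are hypothetical choices made for illustration.

```python
import math
import random


def softmax_choice_prob(value_a, value_b, beta):
    """Probability of choosing stimulus A over B (two-option softmax)."""
    return 1.0 / (1.0 + math.exp(-beta * (value_a - value_b)))


def update_value(value, reward, alpha):
    """Delta-rule update: move the value toward the observed reward."""
    return value + alpha * (reward - value)


def simulate_block(rule_dimension, n_trials=300, p_high=0.8,
                   alpha=0.2, beta=5.0, seed=0):
    """Simulate one color-reward block (assumed setup, not the paper's).

    The learner only tracks and compares feature values in
    `rule_dimension` ("color" or "shape"). Rewards depend on color:
    "red" pays off with probability p_high, "blue" with 1 - p_high.
    Returns the fraction of trials on which the red stimulus was chosen.
    """
    rng = random.Random(seed)
    values = {"red": 0.5, "blue": 0.5, "circle": 0.5, "square": 0.5}
    n_correct = 0
    for _ in range(n_trials):
        # Build a pair of stimuli with randomly combined colors and shapes.
        colors = ["red", "blue"]
        shapes = ["circle", "square"]
        rng.shuffle(colors)
        rng.shuffle(shapes)
        stim_a = {"color": colors[0], "shape": shapes[0]}
        stim_b = {"color": colors[1], "shape": shapes[1]}

        # Choice uses only the feature values in the rule's dimension.
        feat_a = stim_a[rule_dimension]
        feat_b = stim_b[rule_dimension]
        p_a = softmax_choice_prob(values[feat_a], values[feat_b], beta)
        chosen = stim_a if rng.random() < p_a else stim_b

        # Probabilistic feedback is tied to the chosen stimulus's color.
        is_high = chosen["color"] == "red"
        n_correct += int(is_high)
        p_reward = p_high if is_high else 1.0 - p_high
        reward = 1.0 if rng.random() < p_reward else 0.0

        # Only the chosen feature in the rule's dimension is updated.
        chosen_feat = chosen[rule_dimension]
        values[chosen_feat] = update_value(values[chosen_feat], reward, alpha)
    return n_correct / n_trials


if __name__ == "__main__":
    print("color rule:", simulate_block("color"))
    print("shape rule:", simulate_block("shape"))
```

Under these assumptions, a learner whose rule matches the rewarded dimension should pick the better stimulus far more often than one attending to the irrelevant dimension, which stays near chance. A fuller model would also need a mechanism for selecting which rule to apply after an un-cued block transition.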


Bibliographic details

Source: Frontiers in Neuroscience, 2016, issue 2009, 12 pages
Authors: Matthew Balcarras; Thilo Womelsdorf
Original format: PDF
Chinese Library Classification: Neurology and psychiatry
Keywords: value-based decision making; reinforcement learning; rule selection; model-free; cognitive flexibility


