Source: eLife (U.S. National Institutes of Health literature)

Offline replay supports planning in human reinforcement learning

Abstract

Making decisions in sequentially structured tasks requires integrating distally acquired information. The extensive computational cost of such integration challenges planning methods that integrate online, at decision time. Furthermore, it remains unclear whether ‘offline’ integration during replay supports planning, and if so which memories should be replayed. Inspired by machine learning, we propose that (a) offline replay of trajectories facilitates integrating representations that guide decisions, and (b) unsigned prediction errors (uncertainty) trigger such integrative replay. We designed a 2-step revaluation task for fMRI, whereby participants needed to integrate changes in rewards with past knowledge to optimally replan decisions. As predicted, we found that (a) multi-voxel pattern evidence for off-task replay predicts subsequent replanning; (b) neural sensitivity to uncertainty predicts subsequent replay and replanning; (c) off-task hippocampus and anterior cingulate activity increase when revaluation is required. These findings elucidate how the brain leverages offline mechanisms in planning and goal-directed behavior under uncertainty.