Offline replay supports planning in human reinforcement learning

Abstract

Making decisions in sequentially structured tasks requires integrating distally acquired information. The extensive computational cost of such integration challenges planning methods that integrate online, at decision time. Furthermore, it remains unclear whether ‘offline’ integration during replay supports planning, and if so which memories should be replayed. Inspired by machine learning, we propose that (a) offline replay of trajectories facilitates integrating representations that guide decisions, and (b) unsigned prediction errors (uncertainty) trigger such integrative replay. We designed a 2-step revaluation task for fMRI, whereby participants needed to integrate changes in rewards with past knowledge to optimally replan decisions. As predicted, we found that (a) multi-voxel pattern evidence for off-task replay predicts subsequent replanning; (b) neural sensitivity to uncertainty predicts subsequent replay and replanning; (c) off-task hippocampus and anterior cingulate activity increase when revaluation is required. These findings elucidate how the brain leverages offline mechanisms in planning and goal-directed behavior under uncertainty.
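The two proposals in the abstract, offline replay of stored trajectories and replay triggered by unsigned prediction errors, can be illustrated with a toy Dyna-style temporal-difference learner. This is only a minimal sketch under stated assumptions and not the authors' model or task: the state names (A, B, C, D, END), the reward values, the learning rate, the replay threshold, and all function names are hypothetical choices made for illustration.

```python
# Toy sketch (hypothetical states, rewards, and parameters; not the paper's model).
# Stage 1: A -> C and B -> D. Stage 2: C and D lead to END with a reward.

def td_update(values, s, r, s_next, alpha=0.5, gamma=1.0):
    """One TD(0) update; returns the unsigned prediction error |delta|."""
    delta = r + gamma * values.get(s_next, 0.0) - values.get(s, 0.0)
    values[s] = values.get(s, 0.0) + alpha * delta
    return abs(delta)

def experience(values, memory, s, r, s_next):
    """Online step: store the transition in memory, then learn from it."""
    memory[s] = (r, s_next)
    return td_update(values, s, r, s_next)

def offline_replay(values, memory, sweeps=50):
    """Offline step: re-run TD updates over remembered transitions."""
    for _ in range(sweeps):
        for s, (r, s_next) in memory.items():
            td_update(values, s, r, s_next)

def run_demo(replay_threshold=1.0):
    values, memory = {}, {}
    # Learning phase: full trajectories, C is the better second-stage state.
    for _ in range(20):
        experience(values, memory, "A", 0.0, "C")
        experience(values, memory, "C", 10.0, "END")
        experience(values, memory, "B", 0.0, "D")
        experience(values, memory, "D", 1.0, "END")
    # Revaluation phase: only second-stage states are visited, rewards swapped.
    max_unsigned_pe = 0.0
    for _ in range(5):
        max_unsigned_pe = max(max_unsigned_pe,
                              experience(values, memory, "C", 1.0, "END"))
        max_unsigned_pe = max(max_unsigned_pe,
                              experience(values, memory, "D", 10.0, "END"))
    before = dict(values)  # stage-1 values are stale at this point
    # Large unsigned prediction errors (uncertainty) trigger offline replay.
    if max_unsigned_pe > replay_threshold:
        offline_replay(values, memory)
    after = dict(values)
    return before, after

before, after = run_demo()
print("before replay: A=%.2f B=%.2f" % (before["A"], before["B"]))
print("after replay:  A=%.2f B=%.2f" % (after["A"], after["B"]))
```

In this sketch the first-stage values still favour A immediately after revaluation, because A and B were never revisited online; only the replay sweeps propagate the changed second-stage rewards back to them, after which B is preferred.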

Bibliographic details

Journal: eLife
Authors: Ida Momennejad; A Ross Otto; Nathaniel D Daw; Kenneth A Norman
Year (volume): 2018 (7)
Page (article number): e32548
Total pages: 25
Original format: PDF

Similar literature

1. Offline replay supports planning in human reinforcement learning [J]. Ida Momennejad, A Ross Otto, Nathaniel D Daw. eLife. 2018, No. 11
2. Offline replay supports planning in human reinforcement learning [J]. Ida Momennejad, A Ross Otto, Nathaniel D Daw. eLife. 2018, December issue
3. Reinforcement Learning with Experience Replay for Model-Free Humanoid Walking Optimization [J]. Pawel Wawrzynski. International Journal of Humanoid Robotics. 2014, No. 3
4. Search on the Replay Buffer: Bridging Planning and Reinforcement Learning [C]. Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine. Conference on Neural Information Processing Systems. 2020
5. Rhythmic Action Synchronizes Memory Replay During Reinforcement Learning [D]. Roumis, Demetris. 2020
6. Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor–Critic with Hindsight Experience Replay [O]. Evan Prianto, MyeongSeop Kim, Jae-Han Park. 2020
7. Offline replay supports planning in human reinforcement learning [O]. Ida Momennejad, A Ross Otto, Nathaniel D Daw. 2018