PLoS Computational Biology

Predictive representations can link model-based reinforcement learning to model-free mechanisms


Abstract

Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that neural circuits supporting model-based behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning. Here, we lay out a family of approaches by which model-based computation may be built upon a core of TD learning. The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated with model-based learning, while requiring less decision-time computation than dynamic programming. Using simulations, we delineate the precise behavioral capabilities enabled by evaluating actions using this approach, and compare them to those demonstrated by biological organisms. We then introduce two new algorithms that build upon the successor representation while progressively mitigating its limitations. Because this framework can account for the full range of observed putatively model-based behaviors while still utilizing a core TD framework, we suggest that it represents a neurally plausible family of mechanisms for model-based evaluation.

