International Joint Conference on Computational Intelligence

Reinforcement Learning and Attractor Neural Network Models of Associative Learning



Abstract

Despite indisputable advances in reinforcement learning (RL) research, some cognitive and architectural challenges still remain. The primary source of challenges in the current conception of RL stems from the theory's way of defining states. Whereas states under laboratory conditions are tractable (due to the Markov property), states in real-world RL are high-dimensional, continuous, and partially observable. Hence, effective learning and generalization can be guaranteed only if the subset of reward-relevant dimensions is correctly identified for each state. Moreover, the computational discrepancy between model-free and model-based RL methods creates a stability-plasticity dilemma in terms of how to guide optimal decision-making control when multiple interactive and competitive systems are at work, each implementing a different type of RL method. By presenting behavioral results showing that, contrary to a simple RL model, human subjects flexibly define states in a reversal learning paradigm, we argue that these challenges can be met by infusing the RL framework, as an algorithmic theory of human behavior, with the strengths of the attractor framework at the level of neural implementation. Our position is supported by the hypothesis that 'attractor states', which are stable patterns of self-sustained and reverberating brain activity, are a manifestation of the collective dynamics of neuronal populations in the brain. With its capacity for pattern completion and its ability to link events in temporal order, an attractor network becomes relatively insensitive to noise, allowing it to account for the sparse data that are characteristic of high-dimensional, continuous real-world RL.
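To make the "simple RL model" contrast concrete, the sketch below simulates a delta-rule agent with softmax action selection in a two-armed reversal learning task. It is a minimal illustration, not the authors' model: the learning rate, softmax temperature, reward probabilities, and trial counts are all assumed for the example. Because the agent treats the task as a single fixed state, its accuracy collapses after the reversal while it slowly unlearns the old values, rather than re-identifying the state as a human subject would.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-armed reversal task: option 0 is rewarded with p=0.8
# for the first half of trials, then the contingency flips to option 1.
N_TRIALS = 200
REVERSAL = N_TRIALS // 2
ALPHA, BETA = 0.1, 5.0   # assumed learning rate and softmax temperature

Q = np.zeros(2)          # one value per action; the task state is implicit
correct = []

for t in range(N_TRIALS):
    best = 0 if t < REVERSAL else 1
    # Softmax action selection over the current value estimates.
    p = np.exp(BETA * Q) / np.exp(BETA * Q).sum()
    a = rng.choice(2, p=p)
    reward = rng.random() < (0.8 if a == best else 0.2)
    # Delta-rule update: Q[a] <- Q[a] + alpha * (r - Q[a]).
    Q[a] += ALPHA * (reward - Q[a])
    correct.append(a == best)

# Accuracy drops sharply right after the reversal because the agent must
# gradually unlearn the old contingency instead of redefining the state.
print("pre-reversal accuracy:  %.2f" % np.mean(correct[:REVERSAL]))
print("post-reversal accuracy: %.2f" % np.mean(correct[REVERSAL:]))
```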
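The attractor side of the argument can likewise be illustrated with a minimal Hopfield-style network, again a sketch under assumed parameters (network size, number of stored patterns, noise level) rather than the authors' implementation. Hebbian weights store a few binary patterns as attractor states; starting the dynamics from a noisy cue, the network settles back onto the stored pattern, which is the pattern-completion and noise-insensitivity property the abstract appeals to.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 100                                      # neurons (assumed size)
patterns = rng.choice([-1, 1], size=(3, N))  # stored attractor states

# Hebbian outer-product weights; zero diagonal removes self-excitation.
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

# Degraded cue: flip 20% of the bits of the first stored pattern.
cue = patterns[0].copy()
flip = rng.choice(N, size=20, replace=False)
cue[flip] *= -1

# Asynchronous threshold updates until the state settles into an attractor.
state = cue.copy()
for _ in range(5):
    for i in rng.permutation(N):
        state[i] = 1 if W[i] @ state >= 0 else -1

# Overlap of 1.0 means the noisy cue was completed to the stored pattern.
overlap = (state @ patterns[0]) / N
print("overlap with stored pattern after settling: %.2f" % overlap)
```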
