PLoS Computational Biology

Grid Cells, Place Cells, and Geodesic Generalization for Spatial Reinforcement Learning



Abstract

Reinforcement learning (RL) provides an influential characterization of the brain's mechanisms for learning to make advantageous choices. An important problem, though, is how complex tasks can be represented in a way that enables efficient learning. We consider this problem through the lens of spatial navigation, examining how two of the brain's location representations—hippocampal place cells and entorhinal grid cells—are adapted to serve as basis functions for approximating value over space for RL. Although much previous work has focused on these systems' roles in combining upstream sensory cues to track location, revisiting these representations with a focus on how they support this downstream decision function offers complementary insights into their characteristics. Rather than localization, the key problem in learning is generalization between past and present situations, which may not match perfectly. Accordingly, although neural populations collectively offer a precise representation of position, our simulations of navigational tasks verify the suggestion that RL gains efficiency from the more diffuse tuning of individual neurons, which allows learning about rewards to generalize over longer distances given fewer training experiences. However, work on generalization in RL suggests the underlying representation should respect the environment's layout. In particular, although it is often assumed that neurons track location in Euclidean coordinates (that a place cell's activity declines “as the crow flies” away from its peak), the relevant metric for value is geodesic: the distance along a path, around any obstacles. We formalize this intuition and present simulations showing how Euclidean, but not geodesic, representations can interfere with RL by generalizing inappropriately across barriers. 
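The claim that diffuse tuning speeds learning can be made concrete with a toy sketch (not the paper's actual simulations): linear TD(0) on a one-dimensional track, with Gaussian "place cell" responses as the basis functions for the value estimate. The track length, tuning widths, and learning parameters below are illustrative assumptions.

```python
import math

# Toy 1-D track: 20 states, reward only on reaching the rightmost state.
# All sizes and parameters are illustrative, not taken from the paper.
N = 20

def features(s, sigma):
    """Gaussian place-cell population response to position s."""
    return [math.exp(-(s - c) ** 2 / (2 * sigma ** 2)) for c in range(N)]

def value(w, s, sigma):
    """Linear value approximation: V(s) = w . phi(s)."""
    return sum(wi * fi for wi, fi in zip(w, features(s, sigma)))

def td_sweeps(sigma, sweeps=3, alpha=0.1, gamma=0.95):
    """A few left-to-right TD(0) sweeps along the track."""
    w = [0.0] * N
    for _ in range(sweeps):
        for s in range(N - 1):
            r = 1.0 if s + 1 == N - 1 else 0.0
            v_next = 0.0 if s + 1 == N - 1 else gamma * value(w, s + 1, sigma)
            delta = r + v_next - value(w, s, sigma)
            phi = features(s, sigma)
            for i in range(N):
                w[i] += alpha * delta * phi[i]
    return w

w_narrow = td_sweeps(sigma=0.5)  # sharply tuned cells
w_broad = td_sweeps(sigma=3.0)   # diffusely tuned cells

# After the same few experiences, broad tuning has generalized the
# reward's value much farther back along the track.
print(value(w_narrow, 10, 0.5), value(w_broad, 10, 3.0))
```

With sharply tuned cells the reward's value creeps back roughly one state per sweep, as in a tabular learner; with broad tuning each update spreads over many states at once, so mid-track states acquire value from the same handful of experiences.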
Our proposal that place and grid responses should be modulated by geodesic distances suggests novel predictions about how obstacles should affect spatial firing fields, which provides a new viewpoint on data concerning both spatial codes.
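The Euclidean-versus-geodesic contrast can likewise be sketched in a few lines (the grid size, barrier placement, and Gaussian tuning width are illustrative assumptions, not the paper's model): a place field peaked just left of a wall responds strongly at a Euclidean-near point on the far side, whereas tuning over geodesic distance, computed along actual paths, does not generalize across the barrier.

```python
import math
from collections import deque

# Toy 7x7 grid with a vertical wall at x == 3, open only at y == 0.
# Layout and parameters are illustrative, not taken from the paper.
W = H = 7
BLOCKED = {(3, y) for y in range(1, H)}

def geodesic_distances(start):
    """Breadth-first search: shortest path length (in steps) from `start`,
    walking around the wall rather than through it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < W and 0 <= nxt[1] < H
                    and nxt not in BLOCKED and nxt not in dist):
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist

def activity(d, sigma=2.0):
    """Gaussian place-cell tuning curve over a distance d."""
    return math.exp(-d * d / (2 * sigma * sigma))

center = (2, 5)   # place-field peak, just left of the wall
probe = (4, 5)    # probe location just across the wall

d_euclidean = math.dist(center, probe)          # 2.0: "as the crow flies"
d_geodesic = geodesic_distances(center)[probe]  # 12: around the wall

# Euclidean tuning generalizes straight across the barrier;
# geodesic tuning respects the environment's layout.
print(activity(d_euclidean), activity(d_geodesic))
```

Used as basis functions for value learning, the Euclidean field would spread reward information to the wrong side of the wall, which is exactly the inappropriate generalization the simulations above describe.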
