Path-finding in real and simulated rats: assessing the influence of path characteristics on navigation learning



Abstract

A large body of experimental evidence suggests that the hippocampal place field system is involved in reward-based navigation learning in rodents. Reinforcement learning (RL) mechanisms have been used to model this, associating the state space of an RL algorithm with the place-field map of a rat. The convergence properties of RL algorithms are affected by the exploration patterns of the learner. Therefore, we first analyzed the path characteristics of freely exploring rats in a test arena. We found that straight path segments with a mean length of 23 cm, up to a maximal length of 80 cm, make up a significant proportion of the total paths. Thus, rat paths are biased compared to random exploration. Next, we designed an RL system that reproduces these specific path characteristics. Our model arena is covered by overlapping, probabilistically firing place fields (PF) of realistic size and coverage. Because the convergence of RL algorithms is also influenced by the characteristics of the state space, different PF sizes and densities, which lead to different degrees of overlap, were also investigated. The model rat learns to find a reward located opposite its starting point. We observed that the combination of biased straight exploration, overlapping coverage, and probabilistic firing strongly impairs the convergence of learning. When the degree of randomness in the exploration is increased, convergence improves, but the distribution of straight path segments becomes unrealistic and paths become ‘wiggly’. To remedy this without affecting the path characteristics, two additional mechanisms are implemented: a gradual decay of the learned weights (weight decay) and a path length limitation, which prevents learning if the reward is not found within some expected time. Both mechanisms limit the memory of the system and thereby counteract the effects of getting trapped on a wrong path. When these strategies are used individually, divergent cases are substantially reduced, and for some parameter settings no divergence is found at all. When weight decay and path length limitation are used at the same time, convergence is not much improved; instead, the time to convergence increases because the memory-limiting effect becomes too strong. The degree of improvement also depends on the size and degree of overlap (coverage density) of the place field system; the combination of these two parameters leads to a trade-off between convergence and speed of convergence. Thus, this study suggests that the role of the PF system in navigation learning cannot be considered independently of the animals’ exploration pattern.
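To make the described setup concrete, below is a minimal, hypothetical Python sketch of such a model: a simulated rat in a square arena covered by overlapping, probabilistically firing place fields, a TD learner over the place-field features, a softmax policy biased toward straight segments with a mean length of about 23 cm, and the two memory-limiting mechanisms (weight decay and a path length cap). This is not the authors' implementation; the function names (pf_activity, choose_action) and all numerical parameters other than the 23 cm mean segment length are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arena and place-field (PF) layer -------------------------------------------
ARENA = 100.0                    # arena side length in cm (assumed)
PF_SPACING = 10.0                # grid spacing of PF centres in cm (assumed)
PF_SIGMA = 7.5                   # width of the PF firing profile in cm (assumed)
centres = np.array([(x, y)
                    for x in np.arange(5.0, ARENA, PF_SPACING)
                    for y in np.arange(5.0, ARENA, PF_SPACING)])

def pf_activity(pos):
    """Probabilistic PF firing: each field fires with a probability that
    falls off with distance from its centre (Gaussian profile)."""
    d2 = np.sum((centres - pos) ** 2, axis=1)
    p = np.exp(-d2 / (2.0 * PF_SIGMA ** 2))
    return (rng.random(len(centres)) < p).astype(float)

# Learner and exploration policy ----------------------------------------------
N_ACTIONS = 8                                   # discretised headings
angles = np.linspace(0.0, 2.0 * np.pi, N_ACTIONS, endpoint=False)
W = np.zeros((N_ACTIONS, len(centres)))         # action weights over PF features

STEP = 3.0            # distance moved per time step in cm (assumed)
SEGMENT_MEAN = 23.0   # target mean straight-segment length (from the rat data)
GAMMA = 0.95          # discount factor (assumed)
ALPHA = 0.05          # learning rate (assumed)
DECAY = 1e-4          # weight decay per step (first memory-limiting mechanism)
MAX_STEPS = 600       # path length limitation (second memory-limiting mechanism)

start, goal = np.array([10.0, 10.0]), np.array([90.0, 90.0])

def choose_action(phi, prev_a):
    """Softmax policy biased toward keeping the previous heading, so straight
    segments with a mean length of about SEGMENT_MEAN cm emerge."""
    if prev_a is not None and rng.random() > STEP / SEGMENT_MEAN:
        return prev_a                           # keep going straight
    q = W @ phi
    p = np.exp(q - q.max())
    return rng.choice(N_ACTIONS, p=p / p.sum())

for episode in range(300):
    pos, a = start.copy(), None
    phi = pf_activity(pos)
    for t in range(MAX_STEPS):                  # trial ends if goal not found in time
        a = choose_action(phi, a)
        step = STEP * np.array([np.cos(angles[a]), np.sin(angles[a])])
        pos = np.clip(pos + step, 0.0, ARENA)
        reached = np.linalg.norm(pos - goal) < 5.0
        reward = 1.0 if reached else 0.0
        phi_next = pf_activity(pos)
        # Q-learning-style TD update on the chosen action's weights
        td = reward + (0.0 if reached else GAMMA * (W @ phi_next).max()) - W[a] @ phi
        W[a] += ALPHA * td * phi
        W *= 1.0 - DECAY                        # gradual forgetting of learned weights
        phi = phi_next
        if reached:
            break
```

In this sketch the straight-segment bias is a geometric “keep the current heading” rule, which yields an exponential-like segment-length distribution with the stated mean; the actual exploration statistics of the rats and the exact learning rule used in the paper may differ.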
