Home > NIH Literature > Springer Open Choice

Mathematical properties of neuronal TD-rules and differential Hebbian learning: a comparison



Abstract

A confusingly wide variety of temporally asymmetric learning rules exists, related to reinforcement learning and/or to spike-timing dependent plasticity; many of them look exceedingly similar while displaying strongly different behavior. These rules often find their use in control tasks, for example in robotics, and for this rigorous convergence and numerical stability are required. The goal of this article is to review these rules and compare them, providing a better overview of their different properties. Two main classes will be discussed: temporal difference (TD) rules and correlation-based (differential Hebbian) rules, together with some transition cases. In general we will focus on neuronal implementations with changeable synaptic weights and a time-continuous representation of activity. In a machine learning (non-neuronal) context, a solid mathematical theory for TD-learning has existed for several years; this can partly be transferred to a neuronal framework, too. A more complete theory for differential Hebbian rules, on the other hand, has emerged only recently. In general, the rules differ in their convergence conditions and their numerical stability, which can lead to highly undesirable behavior when applying them. For TD, convergence can be enforced with an output condition assuring that the δ-error drops on average to zero (output control). Correlation-based rules, on the other hand, converge when one input drops to zero (input control). Temporally asymmetric learning rules treat situations where incoming stimuli follow each other in time. Thus, it is necessary to remember the first stimulus in order to relate it to the later-occurring second one. To this end, different types of so-called eligibility traces are used by these two types of rules. This aspect again leads to different properties of TD and differential Hebbian learning, as discussed here.
Thus this paper, while also presenting several novel mathematical results, is mainly meant to provide a road map through the different neuronally emulated temporally asymmetric learning rules and their behavior, in order to give some guidance for possible applications.
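To make the contrast between the two rule classes concrete, here is a minimal numerical sketch of a TD rule with an eligibility trace next to a differential Hebbian rule with a low-pass input trace. All stimulus timings, filter constants, and learning rates are illustrative assumptions, not the paper's own simulations:

```python
# A minimal sketch contrasting a TD-style rule with a differential Hebbian
# rule on one stimulus-pairing task. All parameters, pulse timings, and
# filter constants are illustrative assumptions, not taken from the paper.

T, dt = 200, 0.1

def pulse(k, t_on, t_off):
    """Unit pulse between t_on and t_off (seconds) on the time grid."""
    t = k * dt
    return 1.0 if t_on <= t < t_off else 0.0

x1 = [pulse(k, 2.0, 2.5) for k in range(T)]  # early, predictive stimulus
x0 = [pulse(k, 5.0, 5.5) for k in range(T)]  # later stimulus ("reward")

# TD rule with an eligibility trace: the trace e remembers the early input,
# and the weight follows the delta-error until delta averages to zero
# (output control).
gamma, alpha, lam = 0.98, 0.05, 0.9
w_td = 0.0
for trial in range(100):
    e, v_prev = 0.0, 0.0
    for k in range(T):
        v = w_td * x1[k]                     # value predicted from x1
        delta = x0[k] + gamma * v - v_prev   # TD delta-error
        e = gamma * lam * e + x1[k]          # eligibility trace of x1
        w_td += alpha * delta * e
        v_prev = v

# Differential Hebbian rule: correlate a low-pass-filtered trace of the
# early input with the temporal derivative of the output; learning stops
# once the driving input vanishes (input control).
mu, tau = 0.005, 1.0
w_dh = 0.0
for trial in range(100):
    u1, v_prev = 0.0, 0.0
    for k in range(T):
        u1 += dt * (x1[k] - u1) / tau        # filtered trace of x1
        v = w_dh * x1[k] + x0[k]             # output driven by both inputs
        w_dh += mu * u1 * (v - v_prev)       # dw proportional to u1 * dv/dt
        v_prev = v

print(w_td, w_dh)  # both weights end up positive for this pairing
```

Both rules strengthen the weight of the early stimulus, but for different reasons: in the TD case the weight stops changing when the δ-error averages to zero (output control), whereas in the differential Hebbian case learning stops once the early input, and with it the trace u1, is absent (input control).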

