Code-specific policy gradient rules for spiking neurons

Abstract

Although it is widely believed that reinforcement learning is a suitable tool for describing behavioral learning, the mechanisms by which it can be implemented in networks of spiking neurons are not fully understood. Here, we show that different learning rules emerge from a policy gradient approach depending on which features of the spike trains are assumed to influence the reward signals, i.e., depending on which neural code is in effect. We use the framework of Williams (1992) to derive learning rules for arbitrary neural codes. For illustration, we present policy-gradient rules for three different example codes (a spike count code, a spike timing code, and the most general "full spike train" code) and test them on simple model problems. In addition to classical synaptic learning, we derive learning rules for intrinsic parameters that control the excitability of the neuron. The spike count learning rule has structural similarities with established Bienenstock-Cooper-Munro rules. If the distribution of the relevant spike train features belongs to the natural exponential family, the learning rules have a characteristic shape that raises interesting prediction problems.
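
To make the spike-count case concrete, the following is a minimal sketch (not the paper's exact rule) of a REINFORCE-style update in the spirit of Williams (1992): the policy-gradient identity ∇_w E[R] = E[ R ∇_w log p(n | w) ], applied to a Poisson spike count n with log-link mean mu = exp(w · x), gives ∇_w log p(n | w) = (n − mu) x, i.e., the "observed minus expected" shape the abstract attributes to natural-exponential-family codes. The count model, the toy target-count task, and all names (eta, n_target, baseline) are illustrative assumptions, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed toy setup: one neuron, a fixed input pattern x, and a spike
    # count n ~ Poisson(mu) with log-link mean mu = exp(w . x).
    x = np.array([1.0, 0.5, -0.3])   # fixed presynaptic input pattern
    w = np.zeros(3)                  # synaptic weights (policy parameters)
    eta = 0.05                       # learning rate
    n_target = 5                     # reward is delivered at this count
    baseline = 0.0                   # running reward baseline (variance reduction)

    for trial in range(5000):
        mu = np.exp(w @ x)                   # expected spike count
        n = rng.poisson(mu)                  # sampled count: the "action"
        R = 1.0 if n == n_target else 0.0    # reward depends on the count only
        # Score function of a log-link Poisson count:
        #   d log p(n | w) / dw = (n - mu) * x
        w += eta * (R - baseline) * (n - mu) * x
        baseline += 0.05 * (R - baseline)    # slow exponential average of R

    print("learned mean count:", np.exp(w @ x))  # should settle near n_target

Under this indicator reward, E[R] = P(n = n_target), which a Poisson count maximizes at mu = n_target, so the learned mean count drifts toward the target; subtracting the running baseline leaves the gradient estimate unbiased because E[∇_w log p(n | w)] = 0.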
