Goal-directed feature learning

Abstract

Only a subset of available sensory information is useful for decision making. Classical models of the brain's sensory system, such as generative models, consider all elements of the sensory stimuli. However, only the action-relevant components of stimuli need to reach the motor control and decision making structures in the brain. To learn these action-relevant stimuli, the part of the sensory system that feeds into a motor control circuit needs some kind of relevance feedback. We propose a simple network model consisting of a feature learning (sensory) layer that feeds into a reinforcement learning (action) layer. Feedback is established by the reinforcement learner's temporal difference (delta) term modulating an otherwise Hebbian-like learning rule of the feature learner. Under this influence, the feature learning network only learns the relevant features of the stimuli, i.e. those features on which goal-directed actions are to be based. With the input preprocessed in this manner, the reinforcement learner performs well in delayed reward tasks. The learning rule approximates an energy function's gradient descent. The model presents a link between reinforcement learning and unsupervised learning and may help to explain how the basal ganglia receive selective cortical input.
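The core mechanism described in the abstract — a Hebbian-like update on the feature (sensory) layer gated by the reinforcement learner's temporal difference (delta) term — can be sketched as follows. This is a minimal illustrative sketch only; the layer sizes, learning rates, activation function, and function names are assumptions, not the authors' actual model or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not from the paper):
n_in, n_feat, n_act = 8, 4, 2
W_feat = rng.normal(0, 0.1, (n_feat, n_in))  # sensory -> feature weights
W_act = np.zeros((n_act, n_feat))            # feature -> action-value weights
gamma, lr_feat, lr_act = 0.9, 0.01, 0.1

def features(x):
    """Feature-layer activations for a sensory input x."""
    return np.tanh(W_feat @ x)

def step(x, a, r, x_next):
    """One combined update: TD learning in the action layer,
    delta-modulated Hebbian learning in the feature layer."""
    global W_feat, W_act
    phi, phi_next = features(x), features(x_next)
    q, q_next = W_act @ phi, W_act @ phi_next
    # Temporal difference (delta) term of the reinforcement learner:
    delta = r + gamma * q_next.max() - q[a]
    # Action layer: standard TD update on the taken action's weights.
    W_act[a] += lr_act * delta * phi
    # Feature layer: Hebbian term (post * pre), modulated by the same
    # delta, so only action-relevant features get strengthened.
    W_feat += lr_feat * delta * np.outer(phi, x)
    return delta
```

Because the delta term multiplies the Hebbian product, features that carry no reward-relevant information see their updates average out, while action-relevant features are reinforced — the relevance feedback the abstract describes.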
