Journal: Quality Control, Transactions

Unsupervised Conditional Reflex Learning Based on Convolutional Spiking Neural Network and Reward Modulation



Abstract

Automatic decision-making and lane keeping are among the most fundamental and important tasks in robot control research. Given the computing limitations of mobile robot platforms, these tasks call for an easily trainable method with low computational cost and low latency. This paper proposes an unsupervised conditional reflex learning network that learns conditional patterns and makes decisions in an unsupervised manner. A convolutional spiking neural network extracts hidden features of road lanes, and a dopamine modulation mechanism then learns decision-making information from those features. To evaluate the quality of automatic decision-making models, two metrics were designed: total deviation distance per second (TDDPS) and target achievement rate during training (TAR). During training, no labels were given to the convolutional spiking neural network and no manually specified decision-making information was assigned to the dopamine modulation layer. Simulation experiments showed that the proposed model achieves state-of-the-art performance in a relatively complicated scenario and addresses three limitations noted in previous work. The work brings more biological inspiration into decision support systems, in the hope that the proposed method can promote the development of bionic decision support systems in hardware, especially neuromorphic hardware.
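The abstract names the two evaluation metrics but does not give their formulas. As a rough illustration only, the sketch below reads the names literally: TDDPS as the summed absolute deviation from the lane centre divided by the elapsed time, and TAR as the fraction of training episodes in which the target was reached. The function names, argument names, and both definitions are assumptions made here for illustration; the paper's actual definitions may differ.

```python
from typing import Sequence


def tddps(deviations: Sequence[float], duration_s: float) -> float:
    """Assumed reading of TDDPS: total absolute deviation per second.

    deviations: sampled signed distances from the lane centre (hypothetical input).
    duration_s: elapsed time of the run in seconds.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return sum(abs(d) for d in deviations) / duration_s


def tar(reached_target: Sequence[bool]) -> float:
    """Assumed reading of TAR: share of training episodes that reached the target."""
    if not reached_target:
        raise ValueError("at least one episode is required")
    return sum(reached_target) / len(reached_target)


if __name__ == "__main__":
    # Three deviation samples collected over a 2-second run.
    print(tddps([0.5, -0.25, 0.25], 2.0))
    # Four training episodes, three of which reached the target.
    print(tar([True, False, True, True]))
```

Under this reading, a lower TDDPS means the robot stays closer to the lane centre, and a higher TAR means the target is reached more often during training.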
