2019 56th ACM/IEEE Design Automation Conference (DAC)

On the Complexity Reduction of Dense Layers from O(N²) to O(N log N) with Cyclic Sparsely Connected Layers



Abstract

In deep neural networks (DNNs), model size is an important factor affecting performance, energy efficiency and scalability. Recent works on weight pruning have shown significant reduction in model size at the expense of irregularity in the DNN architecture, which necessitates additional indexing memory to address non-zero weights, thereby increasing chip size, energy consumption and delays. In this paper, we propose cyclic sparsely connected (CSC) layers, with a memory/computation complexity of O(N log N), that can be used as an overlay for fully connected (FC) layers, whose O(N²) parameters can dominate the parameters of the entire DNN model. The CSC layers are composed of a few sequential layers, referred to as support layers, which together provide full connectivity between the inputs and outputs of each CSC layer. We introduce an algorithm that trains models whose FC layers are replaced with CSC layers in a bottom-up fashion, incrementally increasing the CSC layers' characteristics, such as connectivity and number of synapses, to achieve the desired accuracy at a given compression rate. One advantage of the CSC layers is that no indexing of non-zero weights is required. Our experimental results using AlexNet on ImageNet and LeNet-300-100 on MNIST indicate that by substituting FC layers with CSC layers, we can achieve 10× to 46× compression within a margin of 2% accuracy loss, which is comparable to non-structural pruning methods. A scalable parallel hardware architecture to implement CSC layers, and an equivalent scalable parallel architecture to efficiently implement non-structurally pruned FC layers, are designed and fully placed and routed on an Artix-7 FPGA and in 65 nm CMOS ASIC technology for the LeNet-300-100 model. The results indicate that, when running at the same frequency and with an equal compression rate, the proposed CSC hardware outperforms the conventional non-structurally pruned architecture by roughly 2× in power, energy, area and resource utilization.
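The abstract does not spell out the exact cyclic pattern of the support layers, so the NumPy sketch below only illustrates the general idea under assumptions: each support layer gives every output a small cyclic fan-in, and stacking about log_f(N) such layers yields full input-output connectivity with O(N log N) weights in total. The helper names (support_mask, csc_masks), the fan-in parameter and the specific offset pattern ((j + k·f^level) mod N) are illustrative choices, not the authors' construction.

```python
import numpy as np

def support_mask(n, fan_in, level):
    """Connectivity mask of one support layer (assumed pattern):
    output j connects to the fan_in inputs (j + k * fan_in**level) mod n."""
    mask = np.zeros((n, n), dtype=bool)
    j = np.arange(n)
    for k in range(fan_in):
        mask[j, (j + k * fan_in**level) % n] = True
    return mask

def csc_masks(n, fan_in):
    """Masks for ceil(log_fan_in(n)) support layers; each holds n * fan_in
    weights, so the whole CSC layer stores O(n log n) weights instead of n**2."""
    levels = int(np.ceil(np.log(n) / np.log(fan_in)))
    return [support_mask(n, fan_in, l) for l in range(levels)]

if __name__ == "__main__":
    n, fan_in = 256, 4
    masks = csc_masks(n, fan_in)
    # Compose the boolean masks to check that every output of the last
    # support layer is (indirectly) reachable from every input.
    reach = masks[0]
    for m in masks[1:]:
        reach = (m.astype(int) @ reach.astype(int)) > 0
    print("support layers:", len(masks))
    print("weights in CSC layer:", int(sum(m.sum() for m in masks)))
    print("fully connected overall:", bool(reach.all()))
```

Running this sketch with N = 256 and fan-in 4 reports 4 support layers and 4096 weights (versus 65 536 for a dense layer of the same size) while confirming full end-to-end connectivity, which mirrors the O(N log N) versus O(N²) trade-off described above.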


