IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks

Abstract

Although existing single-instruction-multiple-data (SIMD)-like accelerators can handle the compressed formats of sparse convolutional neural networks, the sparse and irregular distribution of nonzero elements causes low multiplier utilization within a processing engine (PE) and imbalanced computation across PEs. This brief addresses these issues with a data screening and task mapping (DSTM) accelerator that integrates a series of techniques spanning software refinement and hardware modules. An efficient indexing module identifies the effectual computation pairs and skips unnecessary computations in a fine-grained manner. Intra-PE load imbalance is alleviated by rearranging the weight data, and an effective task sharing mechanism further balances computation across PEs. Compared with the state-of-the-art SIMD-like accelerator, the proposed DSTM improves average PE utilization by 3.5x, and its overall processing throughput is 59.7% higher than that of the previous design.
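To make the zero-skipping idea concrete, the following is a minimal Python sketch of fine-grained effectual-pair selection, assuming a simplified one-dimensional compressed format (nonzero values paired with their positions). The function and variable names are illustrative only; the paper's indexing module performs this matching in hardware, and no detail below is taken from the DSTM design itself.

```python
# Sketch of fine-grained zero-skipping for one compressed activation
# vector and one compressed weight vector. Assumed format (not from
# the paper): parallel lists of nonzero values and their positions.

def effectual_pairs(act_vals, act_idx, wgt_vals, wgt_idx):
    """Return only the activation/weight products whose operands are
    both nonzero, skipping every ineffectual multiplication."""
    wgt_map = dict(zip(wgt_idx, wgt_vals))   # position -> nonzero weight
    pairs = []
    for a, i in zip(act_vals, act_idx):
        w = wgt_map.get(i)
        if w is not None:                    # both operands nonzero
            pairs.append((a, w))
    return pairs

# Example: activations are nonzero at positions 2 and 5, weights at
# positions 0 and 2; only position 2 yields an effectual multiply.
pairs = effectual_pairs(act_vals=[3, 7], act_idx=[2, 5],
                        wgt_vals=[4, 1], wgt_idx=[0, 2])
assert pairs == [(3, 1)]
```

A hardware indexing module can evaluate this position matching for many operand pairs in parallel, so the multipliers in a PE receive only effectual work, which is the utilization gain the abstract describes.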