Stride 2 1-D, 2-D, and 3-D Winograd for Convolutional Neural Networks

Yepez Juan; Ko Seok-Bum

Abstract

Convolutional neural networks (CNNs) have been widely adopted for computer vision applications. CNNs require many multiplications, making their use expensive in terms of both computational complexity and hardware. An effective way to reduce the number of required multiplications is the Winograd algorithm. Previous Winograd-based implementations of CNNs use the 2-D algorithm F(2 × 2, 3 × 3), which reduces computational complexity by a factor of 2.25 over regular convolution. However, current Winograd implementations apply only to a stride (the shift displacement of a kernel over an input) of 1. In this article, we presented a novel method to apply the Winograd algorithm to a stride of 2. This method is valid for one, two, or three dimensions. We also introduced new Winograd versions compatible with kernels of size 3, 5, and 7. The algorithms were successfully implemented on an NVIDIA K20c GPU. Compared to regular convolutions, the stride-2 implementations are 1.44× faster for a 3 × 3 kernel, 2.04× faster for a 5 × 5 kernel, 2.42× faster for a 7 × 7 kernel, and 1.73× faster for a 3 × 3 × 3 kernel. Additionally, a CNN accelerator was implemented on an Intel Arria-10 field-programmable gate array (FPGA) using a novel processing element (PE) that performs two 2-D Winograd stride-1 operations, or one 2-D Winograd stride-2 operation, per clock cycle. We accelerated the original and our proposed modified VGG-16 architectures and achieved digital signal processor (DSP) efficiencies of 1.22 giga operations per second (GOPS)/DSP and 1.33 GOPS/DSP, respectively.
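To make the multiplication savings concrete, the following is a minimal Python sketch of the textbook 1-D Winograd F(2, 3) algorithm, which computes two stride-1 outputs with 4 multiplications instead of 6. Nesting it in two dimensions gives F(2 × 2, 3 × 3) with 16 multiplications instead of 36, which is where the factor of 2.25 comes from. The sketch also shows one standard way a stride-2 convolution can be rewritten as a sum of stride-1 convolutions over the even and odd input samples. This even/odd split is an assumption made here for illustration and is not claimed to be the authors' exact formulation; the function names are likewise hypothetical.

```python
def direct_conv1d(d, g, stride=1):
    """Direct CNN-style (unflipped) 1-D convolution, no padding."""
    r = len(g)
    return [sum(d[i + k] * g[k] for k in range(r))
            for i in range(0, len(d) - r + 1, stride)]


def winograd_f23(d, g):
    """Textbook Winograd F(2, 3): 2 stride-1 outputs from a 4-sample tile."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Kernel transform; in practice it is precomputed once per filter.
    u0, u1, u2, u3 = g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2
    # Four multiplications, versus six for the direct method.
    m0 = (d0 - d2) * u0
    m1 = (d1 + d2) * u1
    m2 = (d2 - d1) * u2
    m3 = (d1 - d3) * u3
    return [m0 + m1 + m2, m1 - m2 - m3]


def stride2_via_even_odd(d, g):
    """Stride-2, size-3 convolution as two stride-1 convolutions.

    Illustrative assumption: y[i] = d[2i]*g0 + d[2i+1]*g1 + d[2i+2]*g2,
    so even samples meet (g0, g2) and odd samples meet (g1).
    """
    even, odd = d[0::2], d[1::2]
    a = direct_conv1d(even, [g[0], g[2]])
    b = direct_conv1d(odd, [g[1]])
    return [x + y for x, y in zip(a, b)]


# Multiplication ratio for 2-D F(2 x 2, 3 x 3): 36 direct vs 16 Winograd.
ratio_2d = (2 * 2 * 3 * 3) / (4 * 4)
```

Each stride-1 sub-convolution produced by such a split could in principle be handed to a Winograd routine of the matching kernel size; the sketch uses the direct method there to keep the identity easy to check.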

Bibliographic details

Source
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2020, Issue 4, pp. 853-863 (11 pages)
Authors
Yepez Juan; Ko Seok-Bum
Affiliations
Univ Saskatchewan, Elect & Comp Engn Dept, Saskatoon, SK S7N 5A9, Canada (both authors)
Original format: PDF
Language: English
Keywords
Convolutional neural network (CNN); deep neural network; field-programmable gate array (FPGA); stride; Winograd