Journal of Real-Time Image Processing

Speeding up inference on deep neural networks for object detection by performing partial convolution


Abstract

Real-time object detection is an anticipated application of deep neural networks (DNNs). It can be achieved by employing graphics processing units (GPUs) or dedicated hardware accelerators. Alternatively, in this work, we present a software scheme to accelerate the inference stage of DNNs designed for object detection. The scheme relies on partial processing within the consecutive convolution layers of a DNN. It exploits the relationships between the locations of the components of an input feature, an intermediate feature representation, and an output feature to efficiently identify the modified components. This downsizes the matrix multiplicand to cover only those modified components, so matrix multiplication within a convolution layer is accelerated. In addition, the same relationships can be used to signal the modified components to the next convolution layer, which further reduces the overhead of member-by-member comparison for identifying them. The proposed scheme has been experimentally benchmarked against a conceptually similar approach, CBinfer, and against the original Darknet, on the Tiny-You Only Look Once (Tiny-YOLO) network. The experiments were conducted on a personal computer with dual CPUs running at 3.5 GHz, without GPU acceleration, using video data sets from YouTube. The results show that average improvement ratios of 1.56 and 13.10 in detection frame rate over CBinfer and Darknet, respectively, are attainable. Our scheme was also extended to exploit GPU-assisted acceleration: on an NVIDIA Jetson TX2, it reached a detection frame rate of 28.12 frames per second (1.25x with respect to CBinfer). Detection accuracy in all experiments was preserved at 90% of that of the original Darknet.
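The core idea of the abstract, recomputing a convolution only at output positions whose receptive fields contain components that changed between consecutive inputs, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it assumes a stride-1, unpadded convolution, uses a simple per-location change mask rather than the paper's layer-to-layer signaling, and all function names (`conv2d_full`, `conv2d_partial`) are illustrative.

```python
import numpy as np

def conv2d_full(x, w):
    """Naive valid (stride-1, no-padding) convolution.
    x: input feature map (C, Hin, Win); w: filters (K, C, kh, kw)."""
    K, C, kh, kw = w.shape
    H, W = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((K, H, W))
    for i in range(H):
        for j in range(W):
            patch = x[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def conv2d_partial(x, prev_x, prev_out, w, threshold=0.0):
    """Partial convolution: reuse the cached output of the previous input
    and recompute only the output positions whose receptive field overlaps
    a changed input component."""
    K, C, kh, kw = w.shape
    H, W = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    # Spatial mask of modified input locations (any channel changed).
    changed = np.abs(x - prev_x).max(axis=0) > threshold
    out = prev_out.copy()
    for i in range(H):
        for j in range(W):
            # Output (i, j) depends on the kh x kw input window at (i, j).
            if changed[i:i + kh, j:j + kw].any():
                patch = x[:, i:i + kh, j:j + kw]
                out[:, i, j] = np.tensordot(w, patch,
                                            axes=([1, 2, 3], [0, 1, 2]))
    return out
```

When only a small region of the frame changes, as is typical for fixed-camera video, the inner product is evaluated for only a small fraction of output positions, which is the source of the speedup described above; the change mask itself plays the role of the signal that could be forwarded to the next layer.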
