Journal of Signal Processing Systems for Signal, Image, and Video Technology

A Parametrizable High-Level Synthesis Library for Accelerating Neural Networks on FPGAs

Abstract

In recent years, Convolutional Neural Networks (CNNs) have been incorporated into a large number of applications, including multimedia retrieval and image classification. However, CNN-based algorithms are computationally and resource intensive and therefore difficult to use in embedded systems. FPGA-based accelerators are becoming increasingly popular in research and industry due to their flexibility and energy efficiency. However, the available resources and the size of the on-chip memory can limit the performance of an FPGA accelerator for CNNs. This work proposes a High-Level Synthesis (HLS) library for CNN algorithms. It contains seven different streaming-capable CNN functions (plus two conversion functions) for creating large neural networks with deep pipelines. The functions have many parameter settings (e.g. for resolution, feature maps, data types, kernel size, parallelization, and accuracy), which also enable compile-time optimizations. Our functions are integrated into the HiFlipVX library, an open-source HLS FPGA library for image processing and object detection. This makes it possible to implement different types of computer vision applications with one library. Because of the various configuration and parallelization options of the library functions, it is possible to implement a high-performance, scalable, and resource-efficient system, as our evaluation of the MobileNets algorithm shows.
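To illustrate what compile-time parameter settings can look like in C++-based HLS code, here is a minimal sketch of a pointwise (1x1) convolution whose sizes, data types, and degree of parallelism are template parameters. It is an assumption-laden illustration, not the HiFlipVX API: the names `pointwise_conv`, `PIXELS`, `IFM`, `OFM`, `PARALLEL`, and `example` are all hypothetical, and plain `std::array` buffers stand in for the streaming interfaces a real HLS design would likely use.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch, not the HiFlipVX API.
// Pointwise (1x1) convolution: every output feature map of a pixel is a
// weighted sum over that pixel's input feature maps.
// All sizes are template parameters, so loop bounds are compile-time
// constants that a compiler or HLS tool could unroll or pipeline.
template <typename T, typename Acc, std::size_t PIXELS, std::size_t IFM,
          std::size_t OFM, std::size_t PARALLEL>
void pointwise_conv(const std::array<T, PIXELS * IFM>& input,
                    const std::array<T, OFM * IFM>& weights,
                    std::array<Acc, PIXELS * OFM>& output) {
    // Invalid configurations are rejected at compile time.
    static_assert(PARALLEL > 0, "PARALLEL must be positive");
    static_assert(OFM % PARALLEL == 0, "OFM must be a multiple of PARALLEL");

    for (std::size_t p = 0; p < PIXELS; ++p) {
        for (std::size_t o = 0; o < OFM; o += PARALLEL) {
            // In an HLS flow, this loop over PARALLEL outputs is the one
            // that would typically be unrolled into parallel hardware.
            for (std::size_t u = 0; u < PARALLEL; ++u) {
                Acc acc = 0;
                for (std::size_t i = 0; i < IFM; ++i) {
                    acc += static_cast<Acc>(input[p * IFM + i]) *
                           static_cast<Acc>(weights[(o + u) * IFM + i]);
                }
                output[p * OFM + o + u] = acc;
            }
        }
    }
}

// Usage example: 2 pixels, 2 input and 2 output feature maps,
// 8-bit data with 32-bit accumulation.
std::array<std::int32_t, 4> example() {
    const std::array<std::int8_t, 4> input = {1, 2, 3, 4};
    const std::array<std::int8_t, 4> weights = {1, 1, 2, -1};
    std::array<std::int32_t, 4> output{};
    pointwise_conv<std::int8_t, std::int32_t, 2, 2, 2, 2>(input, weights, output);
    assert(output[0] == 3 && output[1] == 0);  // pixel 0: 1+2 and 2-2
    return output;
}
```

Pointwise convolutions are one half of the depthwise separable convolutions that MobileNets are built from, which is why this operation was chosen for the sketch. Changing `PARALLEL` here only changes how the work is grouped; in synthesized hardware it would trade resources for throughput.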