Conference: 2017 19th International Symposium on Computer Architecture and Digital Systems

High performance implementation of 2-D convolution using AVX2

Abstract

Convolution is the most important and fundamental concept in multimedia processing. 2-D convolution is used for filtering operations such as sharpening, smoothing, and edge detection. Because it performs many arithmetic operations for every image pixel, it is a compute-intensive kernel. In this paper, we use the Intrinsic Programming Model (IPM) and AVX2 technology to vectorize this kernel explicitly. On a single core, we compare our implementations with Compiler Automatic Vectorizations (CAVs), the OpenCV library, and the OpenMP API, using the ICC, GCC, and LLVM compilers. For multi-threading, OpenMP is used to run the IPM and CAVs implementations on multiple cores. Our experimental results show that our implementations perform much better than the other approaches. In addition, OpenMP significantly improves the performance of our explicit vectorizations with the ICC and GCC compilers.
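To make the idea of explicit vectorization with intrinsics concrete, here is a minimal sketch in C. It is not the paper's implementation: the function names (`window_sum`, `conv3x3_scalar`, `conv3x3_avx2`), the fixed 3x3 kernel, the `float` pixel type, and the "valid region only" border handling are all assumptions made for illustration. The window sum is computed without flipping the kernel (the correlation form, which equals convolution for symmetric kernels). The single-precision intrinsics used here belong to the AVX subset that AVX2-capable CPUs also provide, and the code falls back to the scalar loop when the compiler is not targeting AVX2.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper: 3x3 weighted sum with its top-left corner at (y, x).
   The kernel is applied without flipping (correlation form). */
static float window_sum(const float *in, size_t w, size_t y, size_t x,
                        const float k[9])
{
    float acc = 0.0f;
    for (size_t i = 0; i < 3; ++i)
        for (size_t j = 0; j < 3; ++j)
            acc += k[i * 3 + j] * in[(y + i) * w + (x + j)];
    return acc;
}

/* Scalar reference. `in` is h x w, `out` is (h-2) x (w-2), both row-major. */
static void conv3x3_scalar(const float *in, float *out,
                           size_t h, size_t w, const float k[9])
{
    assert(in != NULL && out != NULL && k != NULL);
    for (size_t y = 0; y + 2 < h; ++y)
        for (size_t x = 0; x + 2 < w; ++x)
            out[y * (w - 2) + x] = window_sum(in, w, y, x, k);
}

/* Explicitly vectorized variant: 8 output pixels per inner iteration. */
static void conv3x3_avx2(const float *in, float *out,
                         size_t h, size_t w, const float k[9])
{
    assert(in != NULL && out != NULL && k != NULL);
#if defined(__AVX2__)
    if (h < 3 || w < 3)
        return;
    const size_t ow = w - 2;                 /* output row width */
    for (size_t y = 0; y + 2 < h; ++y) {
        size_t x = 0;
        for (; x + 8 <= ow; x += 8) {
            __m256 acc = _mm256_setzero_ps();
            for (size_t i = 0; i < 3; ++i)
                for (size_t j = 0; j < 3; ++j) {
                    /* 8 neighbouring input pixels, shifted by the tap offset */
                    __m256 px = _mm256_loadu_ps(&in[(y + i) * w + x + j]);
                    /* broadcast one kernel coefficient to all 8 lanes */
                    __m256 kv = _mm256_set1_ps(k[i * 3 + j]);
                    acc = _mm256_add_ps(acc, _mm256_mul_ps(kv, px));
                }
            _mm256_storeu_ps(&out[y * ow + x], acc);
        }
        for (; x < ow; ++x)                  /* leftover columns */
            out[y * ow + x] = window_sum(in, w, y, x, k);
    }
#else
    conv3x3_scalar(in, out, h, w, k);        /* no AVX2 target: scalar path */
#endif
}
```

With GCC or Clang, the vector path is enabled by compiling with `-mavx2`. Unaligned loads and stores are used so that no alignment requirement is placed on the image buffers; an implementation tuned for speed might instead hoist the nine broadcasts out of the pixel loops.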