Journal: Computer Languages, Systems & Structures

Effective Implementation of Matrix-Vector Multiplication on Intel's AVX multicore Processor



Abstract

The matrix-vector multiplication kernel is one of the most important and common computational operations, forming the core of many application areas in science and engineering. It is therefore important to optimize and accelerate its implementation. This paper proposes an optimized algorithm for single-precision matrix-vector multiplication (SGEMV) on the Intel Core i7 processor. It gives a comprehensive overview of using Intel's Advanced Vector Extensions (AVX) instructions to implement dense matrix-vector multiplication kernels in parallel, and it designs a variety of performance optimization techniques based on the AVX instruction set, memory access optimization, and OpenMP parallelization. The performance of the proposed algorithm is evaluated against the SGEMV subroutines of the latest version of the Intel Math Kernel Library (2017), because those subroutines apply the same optimization methods that are used in this paper. We introduce the optimization techniques, explain the specific details of handling them in the proposed algorithm, and show the advantages and challenges of combining them, in contrast to previous work, which has usually concentrated on a single technique and the performance it achieves. We also investigate guidelines for the parallel implementation of the proposed algorithm and the characteristics of the target architecture that should be considered when implementing it. A comparative study of two of the most widely used C++ compilers is presented: Intel C++ Compiler 17.0 in Intel Parallel Studio XE 2017 versus the Microsoft Visual Studio C++ compiler 2015.
Finally, the two primary ways of using AVX instructions, inline assembly and intrinsic functions, are compared, as are single-core and multi-core platforms. The results are evaluated on a 2.6 GHz Intel Core i7-5600U processor with 128 KB L1 cache, 512 KB L2 cache, and 4 MB L3 cache, running the Windows 10 operating system on a Broadwell system. The proposed optimized algorithm is applied to large square matrices with sizes ranging from 1024 to 19456. The results indicate a performance improvement of 18.2% and 14.1% for y = A x and y = A^T x, respectively, compared with the results obtained using the SGEMV subroutines of the latest version of the Intel Math Kernel Library (2017) on a multi-core platform. (C) 2017 Elsevier Ltd. All rights reserved.
