Journal: Neural, Parallel & Scientific Computations

Multi-Threaded SIMD Implementation of the Back-Propagation Algorithm on Multi-Core Intel Xeon Processors

Abstract

A combination of efficient algorithms and well-designed implementations leads to high-performance applications. This paper shows how to make the back-propagation algorithm run faster on multi-core processors and scale to future hardware that may have more cores and faster memory. On two dual-core Intel Xeon processors, each supporting Hyper-Threading technology, the performance of the multi-threaded SIMD implementation of the matrix back-propagation (MBP) algorithm is around 20 times higher than that of the best conventional implementation on the same hardware. On reasonably large networks, experimental results show that the use of Intel Streaming SIMD Extensions, matrix blocking, loop unrolling, and multithreading on eight logical processors speeds up the MBP by factors of 1.4, 1.75, 1.8, and 4.6, respectively. Moreover, five single-precision floating-point operations can be performed in a single clock cycle by exploiting the memory hierarchy, by executing multiple instructions from multiple threads on multiple data (MIMD), and by selecting an efficient algorithm that is based on matrix operations (MBP algorithm) instead of matrix-vector operations (BP algorithm).
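The difference between the matrix-vector formulation (BP) and the matrix formulation (MBP), together with matrix blocking, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `matvec` and `matmul_blocked`, the layer sizes, and the tile size are assumptions chosen for illustration. Pure Python does not reproduce SIMD or cache effects; the sketch only shows how the computation is structured.

```python
import random

def matvec(W, x):
    """y = W x for a single input pattern (per-pattern, BP-style formulation)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul_blocked(A, B, block=4):
    """C = A B computed tile by tile (matrix blocking).

    Working on block x block tiles is the access pattern that lets a
    compiled implementation reuse data while it is still in cache.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for p0 in range(0, k, block):
            for j0 in range(0, m, block):
                for i in range(i0, min(i0 + block, n)):
                    row_c = C[i]
                    for p in range(p0, min(p0 + block, k)):
                        a = A[i][p]
                        row_b = B[p]
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += a * row_b[j]
    return C

# Hypothetical layer: 7 inputs, 5 outputs, 6 training patterns.
rng = random.Random(0)
n_out, n_in, n_patterns = 5, 7, 6
W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
patterns = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_patterns)]

# BP style: one matrix-vector product per pattern.
per_pattern = [matvec(W, x) for x in patterns]

# MBP style: stack the patterns as columns of X and do one matrix-matrix product.
X = [[patterns[p][i] for p in range(n_patterns)] for i in range(n_in)]
batched = matmul_blocked(W, X, block=4)  # shape: n_out x n_patterns

# Both formulations compute the same layer outputs.
max_diff = max(
    abs(batched[o][p] - per_pattern[p][o])
    for o in range(n_out)
    for p in range(n_patterns)
)
print(max_diff < 1e-12)
```

The matrix form exposes far more data reuse than the per-pattern form, which is what makes blocking, loop unrolling, and SIMD worthwhile. As a consistency check on the reported figures, and assuming the four factors compound multiplicatively, 1.4 × 1.75 × 1.8 × 4.6 ≈ 20.3, which agrees with the roughly 20-fold overall improvement.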
