Multi-Threaded SIMD Implementation of the Back-Propagation Algorithm on Multi-Core Intel Xeon Processors

Mostafa I. Soliman

Abstract

Combining efficient algorithms with well-designed implementations leads to high-performance applications. This paper shows how to make the back-propagation algorithm run faster on multi-core processors and scale to future hardware that may have more cores and faster memory. On two dual-core Intel Xeon processors, each supporting Hyper-Threading technology, the performance of the multi-threaded SIMD implementation of the matrix back-propagation (MBP) algorithm is around 20 times higher than that of the best conventional implementation on the same hardware. On reasonably large networks, experimental results show that the use of Intel streaming SIMD extensions, matrix blocking, loop unrolling, and multithreading on eight logical processors speeds up the MBP by factors of 1.4, 1.75, 1.8, and 4.6, respectively. Moreover, five single-precision floating-point operations can be performed in a single clock cycle by exploiting the memory hierarchy, by executing multiple instructions from multiple threads on multiple data (MIMD), and by selecting an efficient algorithm that is based on matrix operations (the MBP algorithm) instead of matrix-vector operations (the BP algorithm).
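
The contrast between matrix-vector operations (BP) and matrix operations (MBP) can be illustrated with a minimal sketch. It is not the paper's code: the function names, the tile size, and the choice to compute only a layer's weighted sums are assumptions made for illustration, and pure Python shows only the loop structure, not SSE instructions or threads. The first function handles one input pattern at a time, so the weights are re-read for every pattern. The second treats all patterns as one matrix and walks it in tiles, so each loaded weight is reused across a block of patterns.

```python
import random


def net_inputs_per_pattern(W, X):
    """BP style: one matrix-vector product per pattern (column of X)."""
    n_out, n_in, n_pat = len(W), len(W[0]), len(X[0])
    S = [[0.0] * n_pat for _ in range(n_out)]
    for p in range(n_pat):              # every pattern re-reads all weights
        for i in range(n_out):
            acc = 0.0
            for k in range(n_in):
                acc += W[i][k] * X[k][p]
            S[i][p] = acc
    return S


def net_inputs_blocked(W, X, block=4):
    """MBP style: one matrix-matrix product over all patterns, in tiles.

    `block` is a hypothetical tile size; a real implementation would pick
    it so that the tiles being worked on fit in cache.
    """
    n_out, n_in, n_pat = len(W), len(W[0]), len(X[0])
    S = [[0.0] * n_pat for _ in range(n_out)]
    for i0 in range(0, n_out, block):
        for k0 in range(0, n_in, block):
            for p0 in range(0, n_pat, block):
                for i in range(i0, min(i0 + block, n_out)):
                    row_s = S[i]
                    for k in range(k0, min(k0 + block, n_in)):
                        w = W[i][k]     # reused for a whole tile of patterns
                        row_x = X[k]
                        for p in range(p0, min(p0 + block, n_pat)):
                            row_s[p] += w * row_x[p]
    return S


# Usage: both formulations should give the same weighted sums.
random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(5)]  # 5 outputs, 6 inputs
X = [[random.uniform(-1, 1) for _ in range(7)] for _ in range(6)]  # 7 patterns
A = net_inputs_per_pattern(W, X)
B = net_inputs_blocked(W, X, block=4)
max_diff = max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
print("largest difference between the two results:", max_diff)
```

In a compiled implementation, the innermost loop over patterns is the natural place for SIMD instructions and unrolling, and the outer tile loops are one plausible place to divide work among threads. The abstract does not say how the paper partitions the work, so that mapping is only a guess.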

Bibliographic record

Source
Neural, Parallel & Scientific Computations | 2007, Issue 2 | pp. 253-268 | 16 pages
Author
Mostafa I. Soliman
Author affiliation

Computer & Control Section, Electrical Engineering Department, Faculty of Engineering, South Valley University, Aswan, Arab Republic of Egypt

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
multi-core computation; multi-threaded implementation; reusing cached data; streaming SIMD extensions; back-propagation algorithm; neural computation
