International Conference on Intelligent Data Acquisition and Advanced Computing Systems

Efficient parallelization of batch pattern training algorithm on many-core and cluster architectures



Abstract

This paper presents experimental research on a parallel batch pattern back-propagation training algorithm, using a recirculation neural network as an example, on many-core high-performance computing systems. The choice of a recirculation neural network over multilayer perceptron, recurrent, and radial basis function neural networks is justified. The model of a recirculation neural network and the usual sequential batch pattern algorithm for its training are described theoretically. An algorithmic description of the parallel version of the batch pattern training method is presented. The experiments were carried out using the Open MPI, MVAPICH, and Intel MPI message-passing libraries. The results obtained on a many-core AMD system and on Intel MIC are compared with the results obtained on a cluster system. Our results show that the parallelization efficiency is about 95% on 12 cores located inside one physical AMD processor for the considered minimum and maximum scenarios. On 48 AMD cores the parallelization efficiency is about 70–75% for the minimum and maximum scenarios. These results are higher by 15–36% (depending on the MPI library) than the results obtained on 48 cores of a cluster system. The parallelization efficiency obtained on the Intel MIC architecture is surprisingly low and calls for deeper analysis.
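The core idea of batch pattern parallelization, as described in the abstract, is data-parallel: each process accumulates gradients over its own subset of training patterns, the partial sums are combined across processes (e.g. via an `MPI_Allreduce` with `SUM`), and all processes then apply one identical synchronized weight update. The following is a minimal single-process sketch of that scheme; the model, data, and function names are illustrative assumptions, not taken from the paper, and the reduction over worker partitions stands in for the actual MPI collective.

```python
def gradient(w, x, y):
    # Gradient of the squared error 0.5 * (w*x - y)**2 for one pattern.
    return (w * x - y) * x

def batch_update(w, patterns, lr):
    # Sequential batch step: sum gradients over ALL patterns, then update once.
    g = sum(gradient(w, x, y) for x, y in patterns)
    return w - lr * g

def parallel_batch_update(w, patterns, lr, workers):
    # Same step with the patterns partitioned among `workers` processes.
    chunks = [patterns[i::workers] for i in range(workers)]
    # Each worker computes a partial gradient sum over its own chunk ...
    partial = [sum(gradient(w, x, y) for x, y in chunk) for chunk in chunks]
    # ... and the partial sums are combined, as MPI_Allreduce(SUM) would do.
    g = sum(partial)
    return w - lr * g

patterns = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 6.0)]
w_seq = batch_update(1.0, patterns, lr=0.01)
w_par = parallel_batch_update(1.0, patterns, lr=0.01, workers=4)
assert abs(w_seq - w_par) < 1e-12  # partitioning does not change the update
```

Because the gradients are summed before the single weight update, the parallel version is numerically equivalent to the sequential batch step (up to floating-point reordering), which is what makes the batch pattern approach attractive for message-passing parallelization: only one reduction and one broadcast-style synchronization per epoch.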
