Development of load balancing algorithm based on analysis of multi-core architecture on a Beowulf cluster.



Abstract

In this work, analysis and modeling were employed to improve the Linux scheduler for HPC use. The performance throughput of a single compute node of the 23-node Beowulf cluster Virgo 2.0 was analyzed to find bottlenecks and limitations in the processing hardware that affected performance. Each compute node consisted of two quad-core processors with eight gigabytes of memory. The analysis was performed using the High Performance Linpack (HPL) benchmark.

In addition, the processing hardware of the compute node was modeled using an Instructions per Cycle (IPC) metric that was estimated using linear regression. Modeling data was obtained by running the Tuning CacheEdge program, which is part of the ATLAS libraries, and collected using the PerfMonitor program. The model showed peak IPC throughput and a higher Level 2 (L2) cache hit rate at a concurrency of five threads across the eight processing cores.

Modifications were made to the Linux scheduler to improve performance throughput, guided by the hardware analysis and model, which indicated potential bottlenecks at the processor Front-Side Buses (FSBs), the Memory Controller Hub (MCH), and the L2 caches. The modifications included changing the policy of tasks, grouping runnable tasks, load balancing with affinity assignment of the task groups, and control of process termination and feedback.

The results showed that this approach improved performance throughput: the load balancing created higher L2-cache awareness, with an increased hit rate, while reducing the number of times processes accessed the FSB and MCH during execution. Performance throughput peaked with block sizes of 64 and 128 for different matrix and problem sizes; however, as problem and block sizes increased, throughput decreased due to hardware contention in the FSBs and MCH. The peak was due to the block sizes matching the data width of the FSBs and MCH.
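The idea of grouping runnable tasks and balancing the groups with affinity assignment can be illustrated with a small user-space sketch. This is not the dissertation's kernel modification. The cache topology (eight cores in four pairs, each pair assumed to share an L2 cache), the greedy placement rule, and the names `L2_DOMAINS`, `assign_groups`, and `apply_affinity` are all assumptions made for illustration. Only `os.sched_setaffinity` is a real Linux-only Python call.

```python
import os
from typing import Dict, List, Sequence, Set

# Hypothetical topology: 8 cores in 4 pairs, each pair ASSUMED to share one L2 cache.
L2_DOMAINS: List[Set[int]] = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]


def assign_groups(
    groups: Dict[str, Sequence[int]], domains: List[Set[int]]
) -> Dict[str, Set[int]]:
    """Greedily map each task group to the least-loaded cache domain.

    `groups` maps a group name to the PIDs in that group. Larger groups are
    placed first; load is counted as the number of tasks per domain.
    """
    load = [0] * len(domains)
    placement: Dict[str, Set[int]] = {}
    for name in sorted(groups, key=lambda g: len(groups[g]), reverse=True):
        # Pick the domain with the fewest tasks so far (lowest index on ties).
        target = min(range(len(domains)), key=lambda i: load[i])
        load[target] += len(groups[name])
        placement[name] = domains[target]
    return placement


def apply_affinity(
    groups: Dict[str, Sequence[int]], placement: Dict[str, Set[int]]
) -> None:
    """Pin every task of a group to the cores of its assigned domain (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        return  # Not available on this platform; nothing to do.
    for name, pids in groups.items():
        for pid in pids:
            os.sched_setaffinity(pid, placement[name])
```

Keeping the tasks of one group on cores that share a cache is one plausible way to raise the L2 hit rate and reduce trips over the FSB, which is the effect the abstract reports. The actual policy implemented in the modified scheduler may differ.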

Bibliographic record

  • Author

    Valles, Damian.

  • Author affiliation

    The University of Texas at El Paso.

  • Degree-granting institution The University of Texas at El Paso.
  • Subject Engineering Computer; Engineering Electronics and Electrical; Computer Science.
  • Degree Ph.D.
  • Year 2011
  • Pages 147 p.
  • Total pages 147
  • Original format PDF
  • Language eng
  • Chinese Library Classification Linguistics
