Journal of Parallel and Distributed Computing

Efficient and scalable scheduling for performance heterogeneous multicore systems

Abstract

Performance heterogeneous multicore processors (HMPs), which consist of multiple cores with the same instruction set but different performance characteristics (e.g., clock speed, issue width), have attracted considerable interest because they can deliver higher performance per watt and per unit area than comparable homogeneous processors for programs with diverse architectural requirements. However, these power and area efficiencies can only be achieved when workloads are matched to cores according to both the properties of the workload and the features of the cores. Several heterogeneity-aware schedulers have been proposed in previous work. Depending on whether workload properties are obtained online, these scheduling algorithms fall into two classes: online monitoring and offline profiling. Previous online monitoring approaches had to trace each thread's execution on all core types, which becomes impractical as the number of core types grows. Moreover, tracing on all core types requires migrating threads among cores, which may cause load imbalance and degrade performance. Existing offline profiling approaches profile programs with a given input set before actually executing them and thus remove the overhead associated with the number of core types. However, offline profiling approaches do not account for phase changes of threads. Furthermore, because the collected properties are based on the given input set, these approaches adapt poorly to other input sets, which can drastically degrade program performance. To address these problems, we propose a new metric, ASTPI (Average Stall Time Per Instruction), to measure the efficiency of threads in using fast cores. We design, implement, and evaluate a new online monitoring approach called ESHMP, which is based on this metric.
Our evaluation on the Linux 2.6.21 operating system shows that ESHMP delivers scalability while adapting to a wide variety of applications. Our experimental results also show that, among HMP systems that adopt heterogeneity-aware schedulers and have more than one LLC (Last Level Cache), architectures in which heterogeneous cores share LLCs achieve better performance than those in which homogeneous cores share LLCs.
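
The abstract names the ASTPI metric but does not give its formula or the assignment policy built on it. As a rough illustration only, the sketch below assumes that ASTPI is the ratio of stall cycles to retired instructions over a sampling window, and that threads with lower ASTPI make better use of fast cores. Both assumptions, along with the names `ThreadSample`, `astpi`, and `assign_to_fast_cores`, are hypothetical and are not taken from ESHMP itself.

```python
from dataclasses import dataclass


@dataclass
class ThreadSample:
    """Per-thread counters for one sampling window (hypothetical layout)."""
    name: str
    stall_cycles: int   # assumed counter: cycles the thread spent stalled
    instructions: int   # assumed counter: instructions retired in the window


def astpi(sample: ThreadSample) -> float:
    """Average stall time per instruction, under the assumed definition."""
    if sample.instructions == 0:
        # A thread that retired nothing is treated as maximally stalled.
        return float("inf")
    return sample.stall_cycles / sample.instructions


def assign_to_fast_cores(samples, num_fast_cores):
    """Split threads into (fast, slow) name lists.

    Assumption: a lower ASTPI means the thread is less stall-bound and
    therefore benefits more from a fast core.
    """
    ranked = sorted(samples, key=astpi)
    fast = [s.name for s in ranked[:num_fast_cores]]
    slow = [s.name for s in ranked[num_fast_cores:]]
    return fast, slow


if __name__ == "__main__":
    window = [
        ThreadSample("compute", stall_cycles=100, instructions=1000),
        ThreadSample("memory", stall_cycles=900, instructions=1000),
        ThreadSample("mixed", stall_cycles=400, instructions=1000),
    ]
    fast, slow = assign_to_fast_cores(window, num_fast_cores=1)
    print("fast cores:", fast)
    print("slow cores:", slow)
```

Because the ranking needs only counters gathered on whichever core a thread is currently running on, a policy of this shape would avoid tracing each thread on every core type, which is the scalability concern the abstract raises about earlier online monitoring approaches.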
