Conference: IEEE International Symposium on Parallel and Distributed Processing

Using Memory Access Traces to Map Threads and Data on Hierarchical Multi-core Platforms

Abstract

In parallel programs, the tasks of an application must cooperate to complete the required computation. However, the communication time between tasks can vary depending on which cores they execute on and on how the memory hierarchy and interconnect are used. The problem is more pronounced on multi-core machines with NUMA characteristics, where remote memory accesses impose high overhead and make performance more sensitive to thread and data mapping. In this context, process mapping is a technique that improves performance by making better use of resources such as interconnects, main memory, and cache memory. Finding the best mapping is considered NP-hard. Furthermore, in shared-memory environments there is the additional difficulty of discovering the communication pattern, which is implicit and occurs through memory accesses. This work aims to provide a static mapping method for NUMA architectures that requires no prior knowledge of the application. Different metrics were adopted, and a heuristic method based on the Edmonds matching algorithm was used to obtain the mapping. To evaluate our proposal, we use the NAS Parallel Benchmarks (NPB) and two modern multi-core NUMA machines. Results show performance gains of up to 75% compared to the native scheduler and memory allocator of the operating system.
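
The idea of recovering an implicit communication pattern from memory accesses and then pairing threads can be illustrated with a small sketch. This is not the authors' implementation. The trace format (a list of `(thread_id, address)` tuples), the 64-byte block size, and the metric (number of memory blocks touched by both threads) are assumptions made for illustration, and an exhaustive search stands in for the Edmonds matching algorithm, which finds a maximum-weight matching in polynomial time. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def communication_matrix(trace, num_threads, block_size=64):
    """Count, for each thread pair, the memory blocks both threads touched.

    `trace` is an assumed format: an iterable of (thread_id, address) tuples.
    """
    touched = defaultdict(set)  # block index -> set of thread ids
    for thread_id, address in trace:
        touched[address // block_size].add(thread_id)

    comm = [[0] * num_threads for _ in range(num_threads)]
    for threads in touched.values():
        for a, b in combinations(sorted(threads), 2):
            comm[a][b] += 1
            comm[b][a] += 1
    return comm


def best_pairing(comm, threads=None):
    """Exhaustive maximum-weight perfect matching over the threads.

    Stand-in for Edmonds' algorithm; only practical for small thread counts.
    Returns (total_weight, list_of_pairs).
    """
    if threads is None:
        threads = tuple(range(len(comm)))
        if len(threads) % 2 != 0:
            raise ValueError("this sketch needs an even number of threads")
    if not threads:
        return 0, []

    first, rest = threads[0], threads[1:]
    best_weight, best_pairs = -1, []
    for partner in rest:
        remaining = tuple(t for t in rest if t != partner)
        weight, pairs = best_pairing(comm, remaining)
        weight += comm[first][partner]
        if weight > best_weight:
            best_weight, best_pairs = weight, [(first, partner)] + pairs
    return best_weight, best_pairs


# Toy trace: threads 0 and 2 share two blocks, 1 and 3 share one,
# and 0 and 1 share one.
trace = [(0, 0), (2, 8), (0, 64), (2, 64),
         (1, 128), (3, 130), (0, 192), (1, 200)]
comm = communication_matrix(trace, num_threads=4)
weight, pairs = best_pairing(comm)
print(weight, pairs)
```

Each resulting pair is a candidate for placement on cores that share a cache level. Placing the data those threads touch on the NUMA node that runs them would be a separate step not shown here.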
