LBMA and IMAR²: Weighted lottery based migration strategies for NUMA multiprocessing servers

Abstract

Multicore NUMA systems present on-board memory hierarchies and communication networks that influence performance when executing shared memory parallel codes. Characterizing this influence is complex, and understanding the effect of particular hardware configurations on different codes is of paramount importance. In this article, monitoring information extracted from hardware counters at runtime is used to characterize the behavior of each thread for an arbitrary number of multithreaded processes running in a multiprocessing environment. This characterization is given in terms of the number of operations per second, operational intensity, and latency of memory accesses. We propose a runtime tool, executed in user space, that uses this information to guide two different thread migration strategies for improving execution efficiency by increasing locality and affinity without requiring any modification to the running codes. To validate the benefits of our proposal, different configurations of the NAS Parallel OpenMP benchmarks were run concurrently on multicore NUMA systems, with up to four processes running simultaneously. In more than 95% of the executions, our tool outperforms the operating system (OS), and it improves execution time by up to 38% over the OS for heterogeneous workloads under different, realistic locality and affinity scenarios.
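The abstract names three per-thread metrics and a weighted lottery, but gives no formulas. The following Python sketch only illustrates the general idea and is not the authors' LBMA or IMAR² algorithm. The `ThreadSample` fields, the `ticket_weight` formula, and the `draw` helper are all hypothetical names and choices introduced here for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ThreadSample:
    """Hypothetical counter readings for one thread over one interval."""
    tid: int
    flops: float           # floating-point operations counted
    bytes_moved: float     # bytes transferred to or from memory
    latency_cycles: float  # mean memory access latency, in cycles
    seconds: float         # length of the sampling interval

    @property
    def gflops(self) -> float:
        # Operations per second, scaled to GFLOPS.
        return self.flops / self.seconds / 1e9

    @property
    def operational_intensity(self) -> float:
        # Operations performed per byte of memory traffic.
        return self.flops / self.bytes_moved


def ticket_weight(sample: ThreadSample) -> float:
    """Assumed weighting: high latency and low throughput earn more tickets."""
    return sample.latency_cycles / max(sample.gflops, 1e-9)


def draw(weights: dict, rng: random.Random):
    """Weighted lottery: pick one key with probability proportional to its weight."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]


if __name__ == "__main__":
    samples = [
        ThreadSample(tid=1, flops=2e9, bytes_moved=1e9, latency_cycles=100, seconds=1.0),
        ThreadSample(tid=2, flops=1e9, bytes_moved=4e9, latency_cycles=400, seconds=1.0),
    ]
    weights = {s.tid: ticket_weight(s) for s in samples}
    candidate = draw(weights, random.Random())
    print("thread selected as migration candidate:", candidate)
```

A lottery, rather than always picking the worst-scoring thread, keeps some randomness in the choice, so a thread with few tickets can still occasionally be selected. How the actual tool assigns tickets and chooses destinations is described in the full article, not in this abstract.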
