
Parallel application memory scheduling


Abstract

A primary use of chip-multiprocessor (CMP) systems is to speed up a single application by exploiting thread-level parallelism. In such systems, threads may slow each other down by issuing memory requests that interfere in the shared memory subsystem. This inter-thread memory system interference can significantly degrade parallel application performance. Better memory request scheduling may mitigate such performance degradation. However, previously proposed memory scheduling algorithms for CMPs are designed for multi-programmed workloads where each core runs an independent application, and thus do not take into account the inter-dependent nature of threads in a parallel application. In this paper, we propose a memory scheduling algorithm designed specifically for parallel applications. Our approach has two main components, targeting two common synchronization primitives that cause inter-dependence of threads: locks and barriers. First, the runtime system estimates the set of limiter threads, i.e., the threads holding the locks that cause the most serialization, and the memory scheduler prioritizes their requests. Second, the memory scheduler shuffles thread priorities to reduce the time threads take to reach the barrier. We show that our memory scheduler speeds up a set of memory-intensive parallel applications by 12.6% compared to the best previous memory scheduling technique.
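The two mechanisms in the abstract can be illustrated with a minimal sketch of a memory-controller request picker: limiter threads are served first, and the remaining threads are periodically re-ranked so no single thread lags behind and delays the group at the next barrier. The names here (MemRequest, pick_next_request, reshuffle) and the simplified row-hit tiebreak are illustrative assumptions, not the paper's implementation.

import random
from dataclasses import dataclass

@dataclass
class MemRequest:
    thread_id: int
    row_hit: bool   # does the request hit the currently open DRAM row?

def pick_next_request(queue, limiter_threads, shuffled_order):
    """Select the next request to service from the controller queue.
    Priority (highest first):
      1. requests from limiter threads (estimated holders of contended locks)
      2. requests from threads ranked earlier in the current shuffled order
      3. row-buffer hits over misses (a common DRAM scheduling tiebreak)
    """
    def rank(req):
        is_limiter = req.thread_id in limiter_threads
        return (not is_limiter, shuffled_order.index(req.thread_id), not req.row_hit)
    return min(queue, key=rank)

def reshuffle(thread_ids):
    """Re-randomize the priority order of non-limiter threads between barriers
    so that no single thread consistently lags and stalls the whole group."""
    order = list(thread_ids)
    random.shuffle(order)
    return order

# Example: thread 2 holds a contended lock, so its request is served even
# though thread 0's request is a row-buffer hit.
queue = [MemRequest(0, True), MemRequest(2, False), MemRequest(3, False)]
order = reshuffle([0, 1, 2, 3])
best = pick_next_request(queue, limiter_threads={2}, shuffled_order=order)
print(best.thread_id)  # -> 2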
