Conference: Principles and Practice of Parallel Programming

Thread to Strand Binding of Parallel Network Applications in Massive Multi-Threaded Systems


Abstract

In processors with several levels of hardware resource sharing, such as CMPs in which each core is an SMT, scheduling is more complex than in processors with a single level of resource sharing, such as pure-SMT or pure-CMP processors. Once the operating system selects the set of applications to schedule simultaneously on the processor (the workload), each application or thread must be assigned to one of the hardware contexts (strands). We call this last scheduling step the Thread to Strand Binding (TSB). In this paper, we show that the TSB has a high impact on the performance of processors with several levels of shared resources. We measure a variation of up to 59% between different TSBs of real multithreaded network applications running on the UltraSPARC T2 processor, which has three levels of resource sharing. In our view, this problem will become more acute in future multithreaded architectures comprising more cores, more contexts per core, and more levels of resource sharing.

We propose a resource-sharing-aware TSB algorithm (TSBSched) that significantly simplifies thread to strand binding for software-pipelined applications, which are representative of multithreaded network applications. Our systematic approach captures both the characteristics of the multithreaded processor under study and the structure of the software-pipelined applications. Once calibrated for a given processor architecture, our proposal requires neither hardware knowledge on the part of the programmer nor extensive profiling of the application. We validate our algorithm on the UltraSPARC T2 processor running a set of real multithreaded network applications, for which we report improvements of up to 46% compared to current state-of-the-art dynamic schedulers.
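To make the notion of a binding and its sharing levels concrete, the following is a minimal sketch, not the TSBSched algorithm itself. It assumes the commonly documented UltraSPARC T2 layout of 8 cores, 2 hardware pipelines per core, and 4 strands per pipeline, and it assumes strands are numbered linearly (strands 0 to 3 in pipeline 0 of core 0, and so on). The level names, the function names, and the thread names `rx`, `proc`, and `tx` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Assumed UltraSPARC T2 topology (not stated in the abstract).
CORES = 8
PIPES_PER_CORE = 2
STRANDS_PER_PIPE = 4
STRANDS_PER_CORE = PIPES_PER_CORE * STRANDS_PER_PIPE


def locate(strand):
    """Map a linear strand id (0..63) to a (core, pipe, slot) triple."""
    if not 0 <= strand < CORES * STRANDS_PER_CORE:
        raise ValueError(f"strand id out of range: {strand}")
    core, rest = divmod(strand, STRANDS_PER_CORE)
    pipe, slot = divmod(rest, STRANDS_PER_PIPE)
    return core, pipe, slot


def sharing_level(a, b):
    """Name the closest level of hardware that two distinct strands share."""
    core_a, pipe_a, _ = locate(a)
    core_b, pipe_b, _ = locate(b)
    if core_a != core_b:
        return "inter-core"   # only chip-wide resources are shared
    if pipe_a != pipe_b:
        return "intra-core"   # same core, different hardware pipeline
    return "intra-pipe"       # same hardware pipeline


def describe_binding(binding):
    """For a binding {thread name: strand id}, classify every thread pair."""
    return {
        (t1, t2): sharing_level(binding[t1], binding[t2])
        for t1, t2 in combinations(sorted(binding), 2)
    }


if __name__ == "__main__":
    # Hypothetical three-stage software pipeline bound to three strands.
    binding = {"rx": 0, "proc": 1, "tx": 8}
    for pair, level in describe_binding(binding).items():
        print(pair, level)
```

Two bindings of the same threads can differ only in these pairwise sharing levels, which is the kind of difference the abstract reports as affecting performance. A resource-aware scheduler would need measured costs for each level, which this sketch does not model.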
