Architectural support for thread communications in multi-core processors

Abstract

In the ongoing quest for greater computational power, efficiently exploiting parallelism is of paramount importance. Architectural trends have shifted from improving single-threaded application performance, often achieved through instruction-level parallelism (ILP), to improving multithreaded application performance by supporting thread-level parallelism (TLP). Thus, multi-core processors incorporating two or more cores on a single die have become ubiquitous. To achieve concurrent execution on multi-core processors, applications must be explicitly restructured to exploit parallelism, either by programmers or compilers. However, multithreaded parallel programming may introduce overhead due to communications among threads. Though some resources are shared among processor cores, current multi-core processors provide no explicit communications support for multithreaded applications that takes advantage of the proximity between cores. Currently, inter-core communications depend on cache coherence, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we explore two approaches to improve communications support for multithreaded applications. Prepushing is a software-controlled data forwarding technique that sends data to the destination's cache before it is needed, eliminating cache misses in the destination's cache as well as reducing the coherence traffic on the bus. Software Controlled Eviction (SCE) improves thread communications by placing shared data in shared caches so that it can be found in a much closer location than remote caches or main memory. Simulation results show significant performance improvement with the addition of these architecture optimizations to multi-core processors.
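
The difference between demand-based transfers, prepushing, and SCE can be illustrated with a toy model. The sketch below is not the simulator used in the paper: it assumes one producer core and one consumer core, private caches of unlimited capacity modelled as sets of line addresses, and a single shared cache. The function name `simulate` and the policy labels are hypothetical, chosen only for illustration. It records where each consumer read is served from and ignores timing, invalidations, and bus traffic.

```python
from collections import Counter


def simulate(num_lines, policy):
    """Toy model: return a Counter of where the consumer's reads are served.

    policy is one of (hypothetical labels, not from the paper):
      "demand"  - plain cache coherence, lines move only when requested
      "prepush" - producer forwards each line to the consumer's cache
      "sce"     - producer evicts each line into the shared cache
    """
    if policy not in ("demand", "prepush", "sce"):
        raise ValueError(f"unknown policy: {policy}")

    producer_cache, consumer_cache, shared_cache = set(), set(), set()
    served = Counter()

    # Producer phase: write every line, then apply the chosen policy.
    for line in range(num_lines):
        producer_cache.add(line)
        if policy == "prepush":
            consumer_cache.add(line)      # forward a copy before it is needed
        elif policy == "sce":
            producer_cache.discard(line)  # evict the line from the private cache
            shared_cache.add(line)        # and place it in the shared cache

    # Consumer phase: read every line and record where it was found.
    for line in range(num_lines):
        if line in consumer_cache:
            served["local_hit"] += 1
            continue
        if line in shared_cache:
            served["shared_cache"] += 1
        elif line in producer_cache:
            served["remote_cache"] += 1   # demand-based cache-to-cache transfer
        else:
            served["memory"] += 1
        consumer_cache.add(line)          # the line is now cached locally

    return served


if __name__ == "__main__":
    for policy in ("demand", "prepush", "sce"):
        print(policy, dict(simulate(8, policy)))
```

In this model, every consumer read under `"demand"` is a miss served by the producer's remote cache, every read under `"prepush"` hits locally, and every read under `"sce"` misses locally but is served by the closer shared cache. That matches the qualitative behaviour the abstract describes, though it says nothing about the size of the performance gain.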

Bibliographic details

  • Source
    Parallel Computing | 2011, Issue 1 | pp. 26-41 | 16 pages
  • Authors

    Sevin Varoglu; Stephen Jenks;

  • Affiliations

    Department of Electrical Engineering and Computer Science, University of California, Irvine, USA;

    Department of Electrical Engineering and Computer Science, University of California, Irvine, USA;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    multi-core processors; parallelism and concurrency; shared memory;
