Conference: IEEE International Conference on High Performance Computing and Communication

Architectural Support for Exploiting Fine Grain Parallelism

Abstract

The advent of multi-core processors, particularly with projections that numbers of cores will continue to increase, has focused attention on parallel programming. It is widely recognized that current programming techniques, including those that are used for scientific parallel programming, will not allow the easy formulation of general purpose applications. An area which is receiving interest is the use of programming styles which are side-effect free. Previous work on parallel functional programming demonstrated the potential of this to permit the easy exploitation of parallelism. Recent systems like Cilk use conventional languages such as C but encourage the use of a largely functional style (side-effect free) when writing programs. An important part of the Cilk runtime is a system to balance the usage of cores. In this paper we present SLAM (Spreading Load with Active Messages), a dynamic load balancing system based on functional language evaluation techniques. We show that SLAM, provided with appropriate hardware support, significantly outperforms the Cilk system. We evaluated our system using tiled CMPs with private and shared L2 caches separately. Our results show that, for the benchmarks evaluated, SLAM outperforms Cilk by 28% on average when using 32-core CMPs with private L2 caches. For the case of the CMPs with shared L2 caches, SLAM was on average 21% faster than Cilk when using 32 cores and 62% faster when using 64 cores.