IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2009)

Transitive closure on the cell broadband engine: A study on self-scheduling in a multicore processor

Abstract

In this paper, we present a mapping methodology and optimizations for solving transitive closure on the Cell multicore processor. Using our approach, it is possible to achieve near-peak performance for transitive closure on the Cell processor. We first parallelize the standard Floyd-Warshall algorithm and show through analysis and experimental results that data communication is a bottleneck for performance and scalability. We then parallelize a cache-optimized version of the Floyd-Warshall algorithm to remove the memory bottleneck. As is the case with several scientific computing and industrial applications on a multicore processor, synchronization and scheduling of the cores play a crucial role in determining the performance of this algorithm. We define a self-scheduling mechanism for the cores of a multicore processor and design a self-scheduler for the Blocked Floyd-Warshall algorithm on the Cell multicore processor to remove the scheduling bottleneck. We also present optimizations in scheduling order to remove synchronization points. Our implementations achieved up to 78 GFLOPS.
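To make the two algorithm variants named in the abstract concrete, here is a minimal single-threaded sketch in Python. It is not the authors' Cell implementation: the function names (`warshall`, `blocked_warshall`, `_update_block`), the Boolean adjacency-matrix representation, and the block size parameter `b` are assumptions chosen for illustration. The blocked version follows the commonly described three-phase tiling (diagonal block, then its row and column, then all remaining blocks). The phase 3 blocks do not depend on one another, which is the kind of independent work a scheduler could hand out to separate cores.

```python
def warshall(adj):
    """Standard Floyd-Warshall transitive closure on a Boolean matrix."""
    n = len(adj)
    reach = [row[:] for row in adj]  # work on a copy
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def _update_block(reach, i0, j0, k0, b, n):
    """Update block (i0, j0) using intermediate vertices k0 .. k0+b-1."""
    for k in range(k0, min(k0 + b, n)):
        for i in range(i0, min(i0 + b, n)):
            if reach[i][k]:
                for j in range(j0, min(j0 + b, n)):
                    if reach[k][j]:
                        reach[i][j] = True


def blocked_warshall(adj, b):
    """Blocked (tiled) transitive closure with block size b (illustrative)."""
    n = len(adj)
    reach = [row[:] for row in adj]
    starts = range(0, n, b)
    for k0 in starts:
        # Phase 1: the diagonal block depends only on itself.
        _update_block(reach, k0, k0, k0, b, n)
        # Phase 2: blocks in the same block-row and block-column.
        for t in starts:
            if t != k0:
                _update_block(reach, k0, t, k0, b, n)
                _update_block(reach, t, k0, k0, b, n)
        # Phase 3: all remaining blocks; these are mutually independent.
        for i0 in starts:
            for j0 in starts:
                if i0 != k0 and j0 != k0:
                    _update_block(reach, i0, j0, k0, b, n)
    return reach
```

Both functions return the same reachability matrix for a given input; the blocked form only reorders the work so that each step touches a small number of `b` by `b` tiles, which is what makes it friendlier to a cache or a small local memory.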
