Journal: Computer architecture news
Can Traditional Programming Bridge the Ninja Performance Gap for Parallel Computing Applications?

Abstract

Current processor trends of integrating more cores with wider SIMD units, along with a deeper and more complex memory hierarchy, have made it increasingly challenging to extract performance from applications. It is believed by some that traditional approaches to programming do not apply to these modern processors and hence radical new languages must be discovered. In this paper, we question this thinking. We offer evidence in support of traditional programming methods, and of the performance gained per unit of programming effort on common multi-core processors and upcoming many-core architectures, which deliver significant speedup and close-to-optimal performance for commonly used parallel computing workloads.

We first quantify the extent of the "Ninja gap", which is the performance gap between naively written C/C++ code that is parallelism unaware (often serial) and best-optimized code on modern multi-/many-core processors. Using a set of representative throughput computing benchmarks, we show that there is an average Ninja gap of 24X (up to 53X) for a recent 6-core Intel® Core™ i7 X980 Westmere CPU, and that this gap, if left unaddressed, will inevitably increase. We show how a set of well-known algorithmic changes coupled with advancements in modern compiler technology can bring down the Ninja gap to an average of just 1.3X. These changes typically require low programming effort, as compared to the very high effort in producing Ninja code. We also discuss hardware support for programmability that can reduce the impact of these changes and even further increase programmer productivity. We show equally encouraging results for the upcoming Intel® Many Integrated Core architecture (Intel® MIC), which has more cores and wider SIMD. We thus demonstrate that we can contain the otherwise uncontrolled growth of the Ninja gap and offer a more stable and predictable performance growth over future architectures, offering strong evidence that radical language changes are not required.
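
The abstract does not list the "well-known algorithmic changes" it refers to. As an illustration only, the sketch below assumes one common change of this kind: switching from an array-of-structures layout to a structure-of-arrays layout, so that a plain C++ loop reads contiguous memory and becomes easier for a compiler to auto-vectorize. All names here (`PointAoS`, `PointsSoA`, `to_soa`, `squared_length_aos`, `squared_length_soa`) are hypothetical and are not taken from the paper.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures (AoS): the x, y, z of one point sit next to each other,
// so a loop over points reads each field with a stride of three floats.
struct PointAoS { float x, y, z; };

// Structure-of-arrays (SoA): each field is stored contiguously, which gives
// the compiler unit-stride loads that map directly onto SIMD lanes.
struct PointsSoA { std::vector<float> x, y, z; };

// Hypothetical helper: copy AoS data into the SoA layout.
PointsSoA to_soa(const std::vector<PointAoS>& pts) {
    PointsSoA out;
    out.x.reserve(pts.size());
    out.y.reserve(pts.size());
    out.z.reserve(pts.size());
    for (const PointAoS& p : pts) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
    }
    return out;
}

// Naive version: straightforward loop over the AoS layout.
std::vector<float> squared_length_aos(const std::vector<PointAoS>& pts) {
    std::vector<float> out(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        out[i] = pts[i].x * pts[i].x + pts[i].y * pts[i].y + pts[i].z * pts[i].z;
    }
    return out;
}

// Same arithmetic over the SoA layout; still ordinary C++, no intrinsics.
std::vector<float> squared_length_soa(const PointsSoA& pts) {
    const std::size_t n = pts.x.size();
    std::vector<float> out(n);
    const float* x = pts.x.data();
    const float* y = pts.y.data();
    const float* z = pts.z.data();
    float* o = out.data();
    for (std::size_t i = 0; i < n; ++i) {
        o[i] = x[i] * x[i] + y[i] * y[i] + z[i] * z[i];
    }
    return out;
}
```

Both functions compute the same values; only the data layout differs. Whether the second loop is actually vectorized, and how much faster it runs, depends on the compiler, its flags, and the target hardware, so this sketch makes no performance claim of its own.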
