Energy Characterization and Optimization of Parallel Prefix-Sums Kernels

Abstract

Prefix-sums appear in numerous computational tasks, and many performance-efficient parallel prefix-sums algorithms have been introduced for shared and distributed memory architectures. However, as far as we know, neither the energy consumption behavior of these algorithms nor their energy-performance trade-offs are known. This paper is a first attempt to address the energy aspects of CPPS (cache-aware parallel prefix-sums), a high-performance parallel prefix-sums kernel specific to x86 shared memory architectures. We provide implementation details for CPPS and for the various sequential prefix-sums algorithms that are used as its building blocks. We measure the performance and energy consumption of CPPS under different configurations (sequential prefix-sums kernel, CPU frequency, number of threads, and thread placement policy). The results show significant energy savings, from 24% to 55%, when CPPS is configured with an optimized rather than a non-optimized sequential prefix-sums kernel, across various CPU frequency levels and numbers of threads.
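To illustrate how a sequential prefix-sums kernel can act as a building block of a parallel one, here is a minimal sketch of a generic two-pass blocked scheme. It is not the CPPS algorithm, whose details the abstract does not give. The names `seq_prefix_sums` and `blocked_prefix_sums` and the `seq_kernel` parameter are hypothetical, and Python threads are used only to show the structure, not to obtain a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def seq_prefix_sums(block):
    """Plain sequential inclusive prefix sums of one block (hypothetical kernel)."""
    return list(accumulate(block))


def blocked_prefix_sums(data, num_threads=4, seq_kernel=seq_prefix_sums):
    """Two-pass blocked inclusive prefix sums with a pluggable sequential kernel.

    Assumption: this is a textbook blocked scheme, not the CPPS kernel itself.
    """
    n = len(data)
    if n == 0:
        return []
    num_threads = max(1, min(num_threads, n))
    chunk = (n + num_threads - 1) // num_threads  # ceiling division
    blocks = [data[i:i + chunk] for i in range(0, n, chunk)]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Pass 1: each block is scanned independently by the sequential kernel.
        local = list(pool.map(seq_kernel, blocks))

        # Exclusive scan of the block totals gives each block's starting offset.
        totals = [scanned[-1] for scanned in local]
        offsets = [0] + list(accumulate(totals))[:-1]

        # Pass 2: shift every block by the sum of all preceding blocks.
        shifted = list(
            pool.map(lambda pair: [x + pair[1] for x in pair[0]],
                     zip(local, offsets))
        )

    # Concatenate the shifted blocks into the final result.
    return [x for block in shifted for x in block]
```

Swapping `seq_kernel` for a different sequential implementation changes only the per-block work in the first pass, which is the kind of configuration choice the abstract compares.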