Journal: Concurrency and computation: practice and experience
Performance analysis of the Kahan-enhanced scalar product on current multi-core and many-core processors

Abstract

We investigate the performance characteristics of a numerically enhanced scalar product (dot) kernel loop that uses the Kahan algorithm to compensate for numerical errors, and describe efficient single instruction multiple data-vectorized implementations on recent multi-core and many-core processors. Using low-level instruction analysis and the execution-cache-memory performance model, we pinpoint the relevant performance bottlenecks for single-core and thread-parallel execution and predict performance and saturation behavior. We show that the Kahan-enhanced scalar product comes at almost no additional cost compared with the naive (non-Kahan) scalar product if appropriate low-level optimizations, notably single instruction multiple data vectorization and unrolling, are applied. The execution-cache-memory model is extended appropriately to accommodate not only modern Intel multi-core chips but also the Intel Xeon Phi ‘Knights Corner’ coprocessor and an IBM POWER8 CPU. This allows us to discuss the impact of processor features on the performance across four modern architectures that are relevant for high performance computing.
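
For readers unfamiliar with the Kahan algorithm mentioned in the abstract, the following is a minimal scalar sketch of a naive and a Kahan-compensated dot product. It is an illustration only, not the paper's code: the paper concerns SIMD-vectorized and unrolled kernels, whereas this version is plain Python, and the function names `naive_dot` and `kahan_dot` are assumptions chosen for this example.

```python
def naive_dot(a, b):
    """Plain scalar product: rounding errors accumulate in the running sum."""
    s = 0.0
    for x, y in zip(a, b):
        s += x * y
    return s


def kahan_dot(a, b):
    """Scalar product with Kahan compensation (illustrative sketch).

    c tracks the low-order bits lost when a small product is added to a
    large running sum, and feeds them back into the next addition.
    """
    s = 0.0  # running sum
    c = 0.0  # running compensation for lost low-order bits
    for x, y in zip(a, b):
        prod = x * y
        corrected = prod - c       # apply the correction from the last step
        t = s + corrected          # low-order bits of corrected may be lost here
        c = (t - s) - corrected    # recover what was lost (as a negative error)
        s = t
    return s
```

As a usage example, take `a = [1.0] + [1e-16] * 10000` and `b = [1.0] * 10001`. Each product `1e-16` is below half the spacing of doubles near 1.0, so `naive_dot(a, b)` stays at exactly `1.0`, while `kahan_dot(a, b)` recovers the accumulated contribution of the small terms. The compensated loop needs three extra additions or subtractions per element, which is the overhead the abstract reports as nearly hidden once vectorization and unrolling are applied.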
