Conference: Simulation Multi-Conference

Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures

Abstract

The solution of nonsymmetric eigenvalue problems, Ax = λx, can be accelerated substantially by first reducing A to an upper Hessenberg matrix H that has the same eigenvalues as A. This can be done using Householder orthogonal transformations, which are the well-established standard, or using stabilized elementary transformations. Although the latter approach requires half the flops of the former, it has been used less in practice, e.g., on computer architectures with well-developed hierarchical memories, because of its memory-bound operations and the complexity of stabilizing it. In this paper we revisit the stabilized elementary transformations approach in the context of new architectures, both multicore CPUs and Xeon Phi coprocessors. We derive, for the first time, a blocked version of the algorithm. The blocked version reduces the memory-bound operations, and we analyze its performance. A performance model is developed that shows the limitations of both approaches. The competitiveness of stabilized elementary transformations is quantified, showing that the approach can be 20 to 30% faster on current high-end multicore CPUs and Xeon Phi coprocessors.
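
To make the idea of stabilized elementary transformations concrete, the following is a minimal, unblocked sketch in pure Python. It is a textbook-style illustration (Gaussian elimination with row and column interchanges applied as a similarity transformation), not the blocked algorithm derived in the paper. The function name `hessenberg_elementary` and the example matrix are assumptions made for illustration only.

```python
def hessenberg_elementary(a):
    """Reduce a square matrix to upper Hessenberg form by a similarity
    transformation built from interchanges and elementary eliminations.

    `a` is a list of row lists; a reduced copy is returned.
    Illustrative, unblocked sketch only.
    """
    h = [list(row) for row in a]  # work on a copy, leave the input intact
    n = len(h)
    for k in range(n - 2):
        # Stabilization: choose the largest entry below the diagonal in column k.
        p = max(range(k + 1, n), key=lambda i: abs(h[i][k]))
        if h[p][k] == 0:
            continue  # nothing to eliminate in this column
        if p != k + 1:
            # Swap rows AND the matching columns so eigenvalues are preserved.
            h[p], h[k + 1] = h[k + 1], h[p]
            for row in h:
                row[p], row[k + 1] = row[k + 1], row[p]
        pivot = h[k + 1][k]
        for i in range(k + 2, n):
            m = h[i][k] / pivot  # |m| <= 1 because of the pivoting above
            if m == 0:
                continue
            # Row operation: row_i -= m * row_{k+1}, which eliminates h[i][k].
            for j in range(k + 1, n):
                h[i][j] -= m * h[k + 1][j]
            h[i][k] = 0.0  # set the eliminated entry to an exact zero
            # Inverse column operation: col_{k+1} += m * col_i.
            for r in range(n):
                h[r][k + 1] += m * h[r][i]
    return h


# Hypothetical example input, chosen only for illustration.
A = [
    [4.0, 1.0, -2.0, 2.0],
    [1.0, 2.0, 0.0, 1.0],
    [-2.0, 0.0, 3.0, -2.0],
    [2.0, 1.0, -2.0, -1.0],
]
H = hessenberg_elementary(A)
for row in H:
    print(["%8.4f" % v for v in row])
```

Each elimination step pairs a row operation with the inverse column operation, so H stays similar to A and keeps the same eigenvalues. The column updates in this unblocked form sweep the whole matrix for every eliminated entry, which is the kind of memory-bound work the abstract says the blocked version reduces.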
