High performance implementation of tridiagonalization on the SR8000

Abstract

Methods for high-performance tridiagonalization on the HITACHI SR8000 are described and evaluated. To achieve high performance, we adopted blocked tridiagonalization and the scattered square decomposition. In addition, to improve performance within a single node, we used rectangular computation in the diagonal blocks and loop integration to reduce the number of reads and writes. On one node of the SR8000, we achieved about 4.0 Gflop/s for the tridiagonalization of a real symmetric matrix of dimension 4000. This is much better than the 2.9 Gflop/s achieved by our matrix library on the HITACHI S-3800, which has the same peak performance as one node of the SR8000.
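For readers unfamiliar with the underlying operation, the following is a minimal sketch of unblocked Householder tridiagonalization of a real symmetric matrix in plain Python. It is not the blocked, distributed implementation the abstract describes. The function name `householder_tridiagonalize` and the list-of-lists matrix layout are assumptions made for illustration only. The sketch shows just the basic reduction that the blocked variant reorganizes for performance.

```python
import math


def householder_tridiagonalize(a):
    """Reduce a real symmetric matrix to tridiagonal form.

    Illustrative, unblocked sketch (hypothetical helper, not the paper's code).
    `a` is a square list of lists; a new matrix is returned.
    """
    n = len(a)
    t = [row[:] for row in a]  # work on a copy
    for k in range(n - 2):
        # Part of column k below the subdiagonal position.
        x = [t[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(xi * xi for xi in x))
        if norm_x == 0.0:
            continue  # column is already in the desired form
        # Choose the sign that avoids cancellation in v[k + 1].
        alpha = -math.copysign(norm_x, x[0])
        # Unit Householder vector, zero in positions 0..k.
        v = [0.0] * n
        for i, xi in enumerate(x):
            v[k + 1 + i] = xi
        v[k + 1] -= alpha
        norm_v = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm_v for vi in v]
        # Symmetric rank-2 update: T <- H T H with H = I - 2 v v^T,
        # written as T - v w^T - w v^T.
        p = [2.0 * sum(t[i][j] * v[j] for j in range(n)) for i in range(n)]
        vp = sum(vi * pi for vi, pi in zip(v, p))
        w = [pi - vp * vi for pi, vi in zip(p, v)]
        for i in range(n):
            for j in range(n):
                t[i][j] -= v[i] * w[j] + w[i] * v[j]
    return t
```

Each step performs one matrix-vector product and one rank-2 update over the whole matrix, so every pass reads and writes all of it. Blocked formulations generally accumulate several such updates and apply them together as matrix-matrix operations, which is presumably what makes the blocked approach named in the abstract attractive for performance.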
