Optimizing and Scaling HPCG on Tianhe-2: Early Experience

Abstract

This paper presents a first attempt at optimizing and scaling HPCG on Tianhe-2, the world's largest supercomputer. This early work focuses on optimizing the CPU code, without using the Intel Xeon Phi coprocessors. We reformulate the basic CG algorithm to minimize the cost of collective communication, and we employ several optimization techniques, such as SIMDization, loop unrolling, forward and backward sweep fusion, and OpenMP parallelization, to further enhance the performance of kernels such as sparse matrix-vector multiplication, symmetric Gauss-Seidel relaxation, and the geometric multigrid V-cycle. We successfully scale the HPCG code from 256 up to 6,144 nodes (147,456 CPU cores) on Tianhe-2, with nearly ideal weak scalability and an aggregate performance of 79.83 Tflops, which is 6.38X higher than the reference implementation.
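
For readers unfamiliar with the symmetric Gauss-Seidel kernel named in the abstract, the following is a minimal sketch of one sweep (a forward pass followed by a backward pass) over a sparse matrix in CSR form. It is an illustration only, not the authors' optimized code: the two passes are written as separate loops rather than fused, nothing is vectorized or threaded, and the function names, the CSR layout, and the small 1D Laplacian test matrix are all assumptions made for this example.

```python
def laplacian_1d_csr(n):
    """Build a small tridiagonal (2, -1) test matrix in CSR form (hypothetical helper)."""
    row_ptr, col_idx, vals, diag = [0], [], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                col_idx.append(j)
                vals.append(v)
        diag.append(2.0)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, vals, diag


def symgs_sweep(row_ptr, col_idx, vals, diag, b, x):
    """One symmetric Gauss-Seidel sweep, updating x in place.

    The forward pass visits rows 0..n-1 and the backward pass visits
    rows n-1..0. Each row update uses the newest available values of x.
    """
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            s = b[i]
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j != i:  # skip the diagonal entry
                    s -= vals[k] * x[j]
            x[i] = s / diag[i]
    return x


if __name__ == "__main__":
    rp, ci, v, d = laplacian_1d_csr(4)
    b = [1.0, 0.0, 0.0, 1.0]  # equals A times the all-ones vector
    x = [0.0] * 4
    for _ in range(50):
        symgs_sweep(rp, ci, v, d, b, x)
    print(x)  # should approach the all-ones vector
```

Because each row update depends on rows already updated in the same pass, the loop over rows is inherently sequential as written, which is one reason this kernel is a common target for optimization.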

Bibliographic details

Source: International Conference on Algorithms and Architectures for Parallel Processing, 2014, pp. 28-41 (14 pages)
Authors: Xianyi Zhang; Chao Yang; Fangfang Liu; Yiqun Liu; Yutong Lu
Original format: PDF
