Highly scalable parallel sorting

Abstract

Sorting is a commonly used process with a wide breadth of applications in the high performance computing field. Early research in parallel processing has provided us with comprehensive analysis and theory for parallel sorting algorithms. However, modern supercomputers have advanced rapidly in size and changed significantly in architecture, forcing new adaptations to these algorithms. To fully utilize the potential of highly parallel machines, tens of thousands of processors are used. Efficiently scaling parallel sorting on machines of this magnitude is inhibited by the communication-intensive problem of migrating large amounts of data between processors. The challenge is to design a highly scalable sorting algorithm that uses minimal communication, maximizes overlap between computation and communication, and uses memory efficiently. This paper presents a scalable extension of the Histogram Sorting method, making fundamental modifications to the original algorithm in order to minimize message contention and exploit overlap. We implement Histogram Sort, Sample Sort, and Radix Sort in Charm++ and compare their performance. The choice of algorithm as well as the importance of the optimizations is validated by performance tests on two predominant modern supercomputer architectures: XT4 at ORNL (Jaguar) and Blue Gene/P at ANL (Intrepid).
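
The splitter-based approach that the abstract refers to as Histogram Sort can be illustrated with a small sketch. The code below is a sequential, single-process simulation, not the Charm++ implementation evaluated in the paper. It assumes integer keys and at least one key overall. The function names (`global_rank`, `find_splitters`, `histogram_sort`) and the `tolerance` parameter are hypothetical, and plain bisection on the key range stands in for the rounds of probe refinement. Each list in `local_chunks` plays the role of one processor's data.

```python
from bisect import bisect_right


def global_rank(local_chunks, probe):
    """Count keys <= probe across all locally sorted chunks.

    Stands in for a local histogram on every processor followed by a
    global sum reduction.
    """
    return sum(bisect_right(chunk, probe) for chunk in local_chunks)


def find_splitters(local_chunks, num_parts, tolerance):
    """Refine num_parts - 1 splitter keys by bisection on the key range.

    A splitter is accepted once its global rank is within `tolerance`
    of the ideal rank i * total / num_parts (assumption: integer keys).
    """
    total = sum(len(c) for c in local_chunks)
    lo_key = min(c[0] for c in local_chunks if c)
    hi_key = max(c[-1] for c in local_chunks if c)
    splitters = []
    for i in range(1, num_parts):
        target = i * total // num_parts
        lo, hi = lo_key, hi_key
        while lo < hi:
            mid = (lo + hi) // 2
            rank = global_rank(local_chunks, mid)
            if abs(rank - target) <= tolerance:
                lo = mid  # close enough to the ideal rank
                break
            if rank < target:
                lo = mid + 1
            else:
                hi = mid
        splitters.append(lo)
    return splitters


def histogram_sort(local_chunks, tolerance=0):
    """Return one sorted bucket per simulated processor."""
    num_parts = len(local_chunks)
    chunks = [sorted(c) for c in local_chunks]  # local sort
    splitters = find_splitters(chunks, num_parts, tolerance)
    buckets = [[] for _ in range(num_parts)]
    for chunk in chunks:  # simulated all-to-all data exchange
        start = 0
        for dest, splitter in enumerate(splitters):
            end = bisect_right(chunk, splitter)
            buckets[dest].extend(chunk[start:end])
            start = end
        buckets[num_parts - 1].extend(chunk[start:])
    return [sorted(b) for b in buckets]  # final local sort of received keys
```

Calling `histogram_sort(chunks)` on a list of per-processor key lists returns buckets whose concatenation is the globally sorted sequence. A larger `tolerance` accepts splitters after fewer probe rounds at the cost of less even bucket sizes. In a real distributed setting, the rank computation would be a reduction and the bucket fill would be the communication-heavy data migration step that the abstract identifies as the scaling bottleneck.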

Bibliographic details

Source: 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010, pp. 1-12 (12 pages)
Conference location: Atlanta, GA (US)
Authors: Edgar Solomonik; Laxmikant V. Kale
Author affiliation: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Original format: PDF
Language: English
Chinese Library Classification: TP311.133

