Performance evaluation of concurrent collections on high-performance multicore computing systems

Abstract

This paper is the first extensive performance study of a recently proposed parallel programming model, called Concurrent Collections (CnC). In CnC, the programmer expresses her computation in terms of application-specific operations, partially ordered by semantic scheduling constraints. The CnC model is well-suited to expressing asynchronous-parallel algorithms, so we evaluate CnC using two dense linear algebra algorithms in this style for execution on state-of-the-art multicore systems: (i) a recently proposed asynchronous-parallel Cholesky factorization algorithm, and (ii) a novel and non-trivial "higher-level" partly-asynchronous generalized eigensolver for dense symmetric matrices. Given a well-tuned sequential BLAS, our implementations match or exceed competing multithreaded vendor-tuned codes by up to 2.6×. Our evaluation compares with alternative models, including ScaLAPACK with a shared memory MPI, OpenMP, Cilk++, and PLASMA 2.0, on Intel Harpertown, Nehalem, and AMD Barcelona systems. Looking forward, we identify new opportunities to improve the CnC language and runtime scheduling and execution.
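
The idea of operations that are ordered only by their data dependencies can be illustrated with a small sketch. The code below is not the CnC API and not the authors' implementation: it is a hypothetical, sequential Python toy in which each Cholesky step names the items it consumes and the single item it produces, and a scheduler runs any step whose inputs already exist. It assumes scalar "tiles" (block size 1) and a symmetric positive definite input; the names `cholesky_steps`, `KERNELS`, and `run_data_driven` are invented for illustration.

```python
import math
import random

def cholesky_steps(n):
    """List every step as (kernel name, input tags, output tag).

    Tag ("A", i, j, k) is entry (i, j) after k updates; ("L", i, j) is final.
    """
    steps = []
    for k in range(n):
        # Factor the diagonal entry of column k.
        steps.append(("potrf", [("A", k, k, k)], ("L", k, k)))
        # Scale the entries below the diagonal in column k.
        for i in range(k + 1, n):
            steps.append(("trsm", [("A", i, k, k), ("L", k, k)], ("L", i, k)))
        # Update the trailing lower triangle.
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                steps.append(("update",
                              [("A", i, j, k), ("L", i, k), ("L", j, k)],
                              ("A", i, j, k + 1)))
    return steps

# Scalar stand-ins for the tile kernels (assumption: block size 1).
KERNELS = {
    "potrf": lambda a: math.sqrt(a),
    "trsm": lambda a, lkk: a / lkk,
    "update": lambda a, lik, ljk: a - lik * ljk,
}

def run_data_driven(matrix, seed=0):
    """Run steps in an arbitrary order that respects data availability."""
    n = len(matrix)
    # Initial items: the lower triangle of the input, with zero updates applied.
    items = {("A", i, j, 0): matrix[i][j] for i in range(n) for j in range(i + 1)}
    pending = cholesky_steps(n)
    # Shuffle to show that program order does not matter, only dependencies.
    random.Random(seed).shuffle(pending)
    while pending:
        # A step is ready once every item it reads has been produced.
        step = next(s for s in pending if all(tag in items for tag in s[1]))
        name, inputs, output = step
        assert output not in items  # each item is written exactly once
        items[output] = KERNELS[name](*(items[tag] for tag in inputs))
        pending.remove(step)
    return [[items[("L", i, j)] if j <= i else 0.0 for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    A = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    for row in run_data_driven(A):
        print(row)
```

Because every item is written once and steps wait only on the items they read, any shuffle of the step list yields the same factor. A real runtime would execute ready steps in parallel on matrix tiles rather than one scalar at a time.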

Bibliographic record

Source
IEEE International Symposium on Parallel & Distributed Processing | 2010 | 12 pages
Authors
Chandramowlishwaran A.; Knobe K.; Vuduc R.
Original format: PDF
Chinese Library Classification: TP311.138-53

Similar literature

1. RAPID for high-performance computing systems: architecture and performance evaluation [J]. Avinash Karanth Kodi, Ahmed Louri. Applied Optics, 2006, No. 25.
2. Modeling and analysis of performances for concurrent multithread applications on multicore and graphics processing unit systems [J]. Cerotti D., Gribaudo M., Iacono M. Concurrency and Computation: Practice and Experience, 2016, No. 2.
3. Performance evaluation of concurrent collections on high-performance multicore computing systems [C]. Chandramowlishwaran A., Knobe K., Vuduc R. 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.
4. High-performance computing algorithms for constructing inverted files on emerging multicore processors [D]. Wei, Zheng. 2012.
5. DScan – a high-performance digital scanning system for entomological collections [O]. Stefan Schmidt, Michael Balke, Stefan Lafogler. 2012.
6. Performance Evaluation of Concurrent Collections on High-Performance Multicore Computing Systems [O]. Aparna Ch, Kathleen Knobe, Richard Vuduc. 2010.
7. Evaluating Early High-Performance Computing Systems [R]. El-Ghazawi, Tarek; Ozkaya, Armagan; Meajil, Abdullah. 1994.
