Performance of Windows Multicore Systems on Threading and MPI

Abstract

We present performance results on a Windows cluster with up to 768 cores using MPI and two variants of threading: CCR and TPL. CCR (Concurrency and Coordination Runtime) presents a message-based interface while TPL (Task Parallel Library) allows for loops to be automatically parallelized. MPI is used between the cluster nodes (up to 32) and either threading or MPI for parallelism on the 24 cores of each node. We use a simple matrix multiplication kernel as well as a significant bioinformatics gene clustering application. We find that the two threading models offer similar performance, with MPI outperforming both at low levels of parallelism but threading much better when the grain size (problem size per process) is small. We find better performance on Intel compared to AMD on comparable 24 core systems. We develop simple models for the performance of the clustering code.
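The abstract's notions of a parallelized loop and of grain size can be illustrated with a minimal sketch. This is not the authors' code: the paper uses CCR and TPL on .NET, while the sketch below uses Python's standard `concurrent.futures` module, and the names `matmul_rows`, `parallel_matmul`, `workers`, and `grain` are hypothetical, chosen only for illustration. Here `grain` is the number of result rows assigned to each task, a stand-in for the abstract's "problem size per process".

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a, b, row_start, row_end):
    """Compute rows [row_start, row_end) of the matrix product a * b."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(row_start, row_end)
    ]


def parallel_matmul(a, b, workers=4, grain=1):
    """Multiply a by b, handing each task `grain` consecutive rows.

    A small grain means many short tasks, so per-task scheduling
    overhead becomes a larger share of the total run time.
    """
    n = len(a)
    starts = range(0, n, grain)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields the row blocks in submission order.
        blocks = pool.map(
            lambda s: matmul_rows(a, b, s, min(s + grain, n)), starts
        )
        return [row for block in blocks for row in block]
```

In standard CPython the global interpreter lock prevents threads from running this pure-Python arithmetic in parallel, so the sketch shows only how the work is decomposed, not the speedups reported in the paper.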

Bibliographic details

Source
10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing | 2010 | pp. 814-819 | 6 pages
Conference location: Melbourne (AU)
Authors
Qiu Judy; Beason Scott; Bae Seung-Hee; Ekanayake Saliya; Fox Geoffrey;
Original format: PDF
Language: English
Chinese Library Classification: computing technology, computer technology
Keywords
MPI; Multicore; Performance; Threading; Windows

Similar literature

1. Performance of Windows multicore systems on threading and MPI [J]. Judy Qiu, Seung-Hee Bae. Concurrency and Computation. 2012, No. 1
2. Thread-Aware Adaptive Prefetcher on Multicore Systems: Improving the Performance for Multithreaded Workloads [J]. Liu Peng, Yu Jiyang, Huang Michael C. ACM Transactions on Architecture and Code Optimization. 2016, No. 1
3. Parallel Performance of MPI Sorting Algorithms on Dual-Core Processor Windows-Based Systems [J]. Alaa Ismail El-Nashar. International Journal of Distributed and Parallel Systems. 2011, No. 3
4. Performance of Windows Multicore Systems on Threading and MPI [C]. Judy Qiu, Scott Beason, Seung-Hee Bae. IEEE/ACM International Conference on Cluster, Cloud and Grid Computing. 2010
5. Transforming Interrupts to Prioritized Schedulable Threads in Multicore Systems [D]. Baez Hidalgo, Andres Esteban. 2010
6. A High Performance Load Balance Strategy for Real-Time Multicore Systems [O]. Keng-Mao Cho, Chun-Wei Tsai, Yi-Shiuan Chiu
7. Performance of Windows Multicore Systems on Threading and MPI [O]. 2015
