Exploring the performance limits of simultaneous multithreading for memory intensive applications

Evangelia Athanasaki; Nikos Anastopoulos; Kornilios Kourtis; Nectarios Koziris

Abstract

Simultaneous multithreading (SMT) has been proposed to improve system throughput by overlapping instructions from multiple threads on a single wide-issue processor. Recent studies have demonstrated that SMT can yield significant performance gains when the simultaneously executed applications are diverse. However, the speedup of a single application parallelized into multiple threads is often sensitive to its inherent instruction-level parallelism (ILP), as well as to the efficiency of the synchronization and communication mechanisms between its separate, but possibly dependent, threads. Moreover, because these separate threads tend to put pressure on the same architectural resources, no significant speedup can be observed. In this paper, we evaluate and contrast thread-level parallelism (TLP) and speculative precomputation (SPR) techniques for a series of memory-intensive codes executed on a specific SMT processor implementation. We explore the performance limits by evaluating the tradeoffs between ILP and TLP for various kinds of instruction streams. By learning how such streams interact when executed simultaneously on the processor, and by quantifying their presence within each application's threads, we try to interpret the observed performance of each application when parallelized according to these techniques. To complement this evaluation, we also present results gathered from the processor's performance monitoring hardware.

Bibliographic details

Source: Journal of supercomputing, 2008, Issue 1, pp. 64-97 (34 pages)
Authors: Evangelia Athanasaki; Nikos Anastopoulos; Kornilios Kourtis; Nectarios Koziris
Affiliation: School of Electrical and Computer Engineering, Computing Systems Laboratory, National Technical University of Athens, Zografou Campus, Zografou 15773, Greece
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: simultaneous multithreading; thread-level parallelism; instruction-level parallelism; software prefetching; speculative precomputation; performance analysis
