Increasing data reuse of sparse algebra codes on simultaneous multithreading architectures

J. C. Pichel; D. B. Heras; J. C. Cabaleiro; F. F. Rivera

Abstract

This paper studies the locality of sparse algebra codes on simultaneous multithreading (SMT) architectures. In this kind of architecture, many hardware structures are dynamically shared among the running threads. This puts considerable stress on the memory hierarchy, and poor locality, both inter-thread and intra-thread, may become a major performance bottleneck. The effect is even more pronounced when the code is irregular, as is the case for sparse matrix codes. Techniques that increase the locality of irregular codes on SMT architectures are therefore important for achieving high performance. This paper proposes a data reordering technique specially tuned for this kind of architecture and code. It is based on a locality model developed by the authors in previous work. The technique was tested first on a simulator of an SMT architecture and subsequently on a real architecture, Intel's Hyper-Threading. Significant reductions in the number of cache misses were achieved, even as the number of running threads grows. Applying the locality improvement technique also decreases the total execution time and improves the scalability of the code.
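
To make the idea of data reordering for locality concrete, the following is a minimal sketch in Python. It is not the authors' locality model or reordering technique, which the abstract does not describe. It assumes a CSR layout, a hypothetical cost proxy (`locality_cost`, counting cache lines of the input vector that a row touches and the previous row did not), and a simple greedy heuristic (`greedy_order`). All function names and the `line_size` parameter are illustrative assumptions.

```python
from typing import List, Set, Tuple

# CSR matrix as (row_ptr, col_idx, values); an assumed layout for this sketch.
CSR = Tuple[List[int], List[int], List[float]]


def spmv(csr: CSR, x: List[float]) -> List[float]:
    """Sparse matrix-vector product y = A * x for a CSR matrix."""
    row_ptr, col_idx, vals = csr
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Irregular access: x is indexed through col_idx.
            y[i] += vals[k] * x[col_idx[k]]
    return y


def row_lines(csr: CSR, i: int, line_size: int) -> Set[int]:
    """Cache lines of x touched by row i, with line_size elements per line."""
    row_ptr, col_idx, _ = csr
    return {col_idx[k] // line_size for k in range(row_ptr[i], row_ptr[i + 1])}


def locality_cost(csr: CSR, order: List[int], line_size: int) -> int:
    """Hypothetical proxy: lines of x a row touches that the previous row did not."""
    cost = 0
    prev: Set[int] = set()
    for i in order:
        cur = row_lines(csr, i, line_size)
        cost += len(cur - prev)
        prev = cur
    return cost


def greedy_order(csr: CSR, line_size: int) -> List[int]:
    """Greedy heuristic: after each row, pick the row adding the fewest new lines."""
    n = len(csr[0]) - 1
    if n == 0:
        return []
    lines = [row_lines(csr, i, line_size) for i in range(n)]
    order = [0]
    remaining = set(range(1, n))
    while remaining:
        last = lines[order[-1]]
        # Ties are broken by the smaller row index to keep the result deterministic.
        nxt = min(remaining, key=lambda j: (len(lines[j] - last), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def permute_rows(csr: CSR, order: List[int]) -> CSR:
    """Build a new CSR matrix whose rows follow the given order."""
    row_ptr, col_idx, vals = csr
    new_ptr, new_col, new_vals = [0], [], []
    for i in order:
        new_col.extend(col_idx[row_ptr[i]:row_ptr[i + 1]])
        new_vals.extend(vals[row_ptr[i]:row_ptr[i + 1]])
        new_ptr.append(len(new_col))
    return new_ptr, new_col, new_vals
```

Reordering rows only permutes the entries of the result vector, so the product on the reordered matrix can be mapped back to the original row numbering through `order`. The proxy ignores cache capacity and the sharing of the cache among threads, both of which matter on an SMT processor.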

Bibliographic details

Source
Concurrency and Computation | 2009, Issue 15 | pp. 1838-1856 | 19 pages
Authors
J. C. Pichel; D. B. Heras; J. C. Cabaleiro; F. F. Rivera
Author affiliations

Universidad Carlos III de Madrid, Av. de la Universidad 30, 28911 Leganes, Spain;

Electronics and Computer Science Department, Universidade de Santiago de Compostela, Spain;

Electronics and Computer Science Department, Universidade de Santiago de Compostela, Spain;

Electronics and Computer Science Department, Universidade de Santiago de Compostela, Spain;

Original format: PDF
Language: English
Keywords
sparse matrix; irregular codes; data reuse; locality; multithreading; sparse algebra codes;

