Hiding Latency Through Bulk Transfer and Prefetching in Distributed Shared Memory Multiprocessors

Abstract

Distributed shared memory (DSM) machines provide a shared memory paradigm and achieve high performance by caching shared data. However, they suffer from cache misses and remote access latency with coarse-grain patterns. In this paper, we suggest the combination of bulk transfer and prefetching as a new latency hiding technique in DSM machines. The purpose of bulk transfer is to replicate remote data into local memory and thus reduce remote accesses. Adaptive Granularity (AG) is used for bulk transfer, and prefetching is added to fetch the replicated data into the cache at the right time. We can apply a simple prefetch schedule, as in a uniprocessor, since bulk transfer converts remote accesses into local ones. Simulation results show reduced latency and the potential of AG as a preferable architecture for prefetching in DSM machines.
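
The interaction between the two mechanisms can be illustrated with a small access-counting model. The sketch below is not the paper's simulator and does not model Adaptive Granularity itself: the `Node` class, the `bulk_transfer`, `prefetch`, `load` and `run` helpers, the 16-word data set and the look-ahead distance of 4 are all assumptions made for illustration. It only counts where each read is served from, once with plain demand fetching and once with a bulk transfer followed by fixed-distance prefetching.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy DSM node (hypothetical): local memory holds replicated words,
    the cache holds the addresses that are currently cached."""
    local_memory: dict = field(default_factory=dict)
    cache: set = field(default_factory=set)
    remote_accesses: int = 0   # trips across the interconnect
    local_misses: int = 0      # cache misses served from local memory


def bulk_transfer(node, remote, start, count):
    """Replicate remote[start:start+count] into local memory as one remote operation."""
    node.remote_accesses += 1
    for addr in range(start, start + count):
        node.local_memory[addr] = remote[addr]


def prefetch(node, addr):
    """Bring a locally replicated word into the cache; ignore anything not replicated."""
    if addr in node.local_memory:
        node.cache.add(addr)


def load(node, remote, addr):
    """Read one word and count where the access had to be served from."""
    if addr in node.cache:
        return node.local_memory[addr]          # cache hit, nothing to count
    if addr not in node.local_memory:
        node.remote_accesses += 1               # demand fetch from the home node
        node.local_memory[addr] = remote[addr]
    else:
        node.local_misses += 1                  # miss served from local memory
    node.cache.add(addr)
    return node.local_memory[addr]


def run(remote, use_bulk, distance=4):
    """Sum every word of `remote`, optionally with bulk transfer plus prefetching."""
    node = Node()
    n = len(remote)
    if use_bulk:
        bulk_transfer(node, remote, 0, n)
    total = 0
    for i in range(n):
        if use_bulk:
            prefetch(node, i + distance)        # fixed look-ahead, uniprocessor style
        total += load(node, remote, i)
    return total, node


remote_data = list(range(16))                   # assumed toy data set
baseline_sum, baseline = run(remote_data, use_bulk=False)
combined_sum, combined = run(remote_data, use_bulk=True)
print(baseline.remote_accesses, combined.remote_accesses, combined.local_misses)
```

In this model the baseline pays one remote access per word, while the combined scheme pays a single remote operation and then misses locally only on the first few words, before the look-ahead takes effect. The counts say nothing about real latencies, which depend on the machine and on how the transfer granularity is chosen.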


Bibliographic details

Source
International conference/exhibition on high performance computing in the Asia-Pacific region; HPC-Asia'2000, 2000, pp. 164-166 (3 pages)
Authors
Yangwoo Roh; Byeong Hag Seong; Daeyeon Park
Original format: PDF
Chinese Library Classification: Automation technology, computer technology
