The Performance Optimization of Threaded Prefetching for Linked Data Structures

Yan Huang; Jie Tang; Zhi-min Gu; Min Cai; Jianxun Zhang; Ninghan Zheng

Abstract

Helper threaded prefetching on chip multiprocessors (CMPs) is a well-known approach to reducing memory latency and has been explored for accesses to linked data structures. However, conventional helper threaded prefetching often suffers from useless prefetches and cache thrashing, which limit its effectiveness. In this paper, we first analyze the shortcomings of conventional helper threaded prefetching for linked data structures. We then propose an improved scheme, Skip Helper Threaded Prefetching, for hotspots with two-level data traversals. Our solution profiles the applications and balances delinquent loads between the main thread and the prefetching thread based on the characteristics of the operations in their hotspots. Evaluations show that the proposed solution improves average performance by 8.9% (-O2) and 8.5% (-O3) over conventional helper threaded prefetching that greedily prefetches all delinquent loads. We also compare our proposal with active threaded prefetching, which synchronizes with the main thread through a semaphore, and find that our proposal provides better performance for the targeted applications.
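To make the idea in the abstract concrete, the following is a minimal sketch in C of helper threaded prefetching over a two-level linked structure. It is not the paper's implementation: the types `Outer` and `Inner`, the `greedy` flag, and the function names are hypothetical, and the non-greedy mode is only one possible reading of "balancing delinquent loads" (the helper walks the outer level and leaves the inner-level loads to the main thread). It assumes GCC or Clang (for `__builtin_prefetch`) and POSIX threads, and it omits the synchronization and prefetch-distance control that a real scheme would need.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical two-level structure: an outer list whose nodes own inner lists. */
typedef struct Inner { struct Inner *next; long value; } Inner;
typedef struct Outer { struct Outer *next; Inner *items; } Outer;

/* Arguments for the helper thread (names are assumptions for illustration). */
typedef struct { const Outer *head; int greedy; } HelperArgs;

/* Helper thread: walks the structure so its loads warm the shared cache. */
static void *helper_thread(void *p) {
    const HelperArgs *a = (const HelperArgs *)p;
    for (const Outer *o = a->head; o != NULL; o = o->next) {
        if (a->greedy) {
            /* Conventional style: touch every node on both levels. */
            for (const Inner *i = o->items; i != NULL; i = i->next)
                __builtin_prefetch(i->next, 0, 3);
        } else {
            /* Assumed "skip" style: prefetch only the first inner node and
               leave the rest of the inner traversal to the main thread. */
            __builtin_prefetch(o->items, 0, 3);
        }
    }
    return NULL;
}

/* Main thread: the actual computation over both levels. */
static long main_traversal(const Outer *head) {
    long sum = 0;
    for (const Outer *o = head; o != NULL; o = o->next)
        for (const Inner *i = o->items; i != NULL; i = i->next)
            sum += i->value;
    return sum;
}

/* Build n outer nodes, each with m inner nodes holding the values 1..m. */
static Outer *build(size_t n, size_t m) {
    Outer *head = NULL;
    for (size_t k = 0; k < n; k++) {
        Outer *o = malloc(sizeof *o);
        if (o == NULL) abort();
        o->items = NULL;
        for (size_t j = m; j > 0; j--) {
            Inner *i = malloc(sizeof *i);
            if (i == NULL) abort();
            i->value = (long)j;
            i->next = o->items;
            o->items = i;
        }
        o->next = head;
        head = o;
    }
    return head;
}

static void destroy(Outer *head) {
    while (head != NULL) {
        Outer *next_outer = head->next;
        for (Inner *i = head->items; i != NULL; ) {
            Inner *next_inner = i->next;
            free(i);
            i = next_inner;
        }
        free(head);
        head = next_outer;
    }
}

/* Run the main traversal alongside a helper thread and return the sum. */
long run_demo(size_t n, size_t m, int greedy) {
    Outer *head = build(n, m);
    HelperArgs args = { head, greedy };
    pthread_t tid;
    int started = pthread_create(&tid, NULL, helper_thread, &args) == 0;
    long sum = main_traversal(head);
    if (started) pthread_join(tid, NULL);  /* join before freeing the nodes */
    destroy(head);
    return sum;
}
```

Calling `run_demo(4, 3, 1)` and `run_demo(4, 3, 0)` returns the same sum in both modes, since the helper thread only reads the structure; the modes differ only in how much of the traversal the helper duplicates. On older toolchains the sketch may need to be compiled with `-pthread`.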


Bibliographic details

Source
International Journal of Parallel Programming, 2012, Issue 2, pp. 141-163 (23 pages)
Authors
Yan Huang; Jie Tang; Zhi-min Gu; Min Cai; Jianxun Zhang; Ninghan Zheng
Author affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China; Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou, Henan, China;

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
chip multiprocessor (CMP); prefetching thread; delinquent load; performance analysis; hotspot profiling;


