The Performance Optimization of Threaded Prefetching for Linked Data Structures


Abstract

Helper threaded prefetching on chip multiprocessors (CMPs) is a well-known approach to reducing memory latency and has been explored for accesses to linked data structures. However, conventional helper threaded prefetching often suffers from useless prefetches and cache thrashing, which limit its effectiveness. In this paper, we first analyze the shortcomings of conventional helper threaded prefetching for linked data structures. We then propose an improved scheme, Skip Helper Threaded Prefetching, for hotspots with two-level data traversals. Our solution profiles the applications and balances delinquent loads between the main thread and the prefetching thread based on the characteristics of the operations in their hotspots. Evaluations show that the proposed solution improves average performance by 8.9% (-O2) and 8.5% (-O3) over conventional helper threaded prefetching, which greedily prefetches all delinquent loads. We also compare our proposal with active threaded prefetching, which synchronizes with the main thread through a semaphore, and find that our proposal provides better performance for the targeted applications.
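To make the idea of a helper prefetching thread on a two-level traversal concrete, the following is a minimal sketch in C. It is not the authors' implementation: the `Outer`/`Inner` node layout, the `stride` parameter, and the functions `build`, `traverse`, and `destroy` are all hypothetical names chosen for illustration, and the "skip every k-th outer node" policy is only one assumed way of dividing delinquent loads between the two threads. The abstract does not specify the actual policy, which the paper derives from profiling. The sketch uses POSIX threads and the GCC/Clang builtin `__builtin_prefetch`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical two-level structure: an outer list whose nodes each own an inner list. */
typedef struct Inner { long value; struct Inner *next; } Inner;
typedef struct Outer { Inner *items; struct Outer *next; } Outer;

/* Build outer_len outer nodes, each with inner_len inner nodes holding the value 1. */
Outer *build(int outer_len, int inner_len) {
    Outer *head = NULL;
    for (int i = 0; i < outer_len; i++) {
        Outer *o = malloc(sizeof *o);
        assert(o != NULL);
        o->items = NULL;
        for (int j = 0; j < inner_len; j++) {
            Inner *n = malloc(sizeof *n);
            assert(n != NULL);
            n->value = 1;
            n->next = o->items;
            o->items = n;
        }
        o->next = head;
        head = o;
    }
    return head;
}

/* Free every inner list and then the outer nodes. */
void destroy(Outer *head) {
    while (head != NULL) {
        Outer *next_outer = head->next;
        for (Inner *n = head->items; n != NULL; ) {
            Inner *next_inner = n->next;
            free(n);
            n = next_inner;
        }
        free(head);
        head = next_outer;
    }
}

typedef struct { const Outer *head; int stride; } HelperArgs;

/* Helper thread: a read-only walk that touches nodes ahead of the main thread.
 * stride == 1 prefetches every inner list (the greedy, conventional scheme).
 * stride == k prefetches the inner list of every k-th outer node only and
 * leaves the remaining inner lists to the main thread (assumed skip policy). */
static void *helper(void *p) {
    const HelperArgs *a = p;
    int stride = a->stride < 1 ? 1 : a->stride;
    int idx = 0;
    for (const Outer *o = a->head; o != NULL; o = o->next, idx++) {
        if (idx % stride != 0)
            continue;                          /* skipped: main thread loads it */
        for (const Inner *n = o->items; n != NULL; n = n->next)
            __builtin_prefetch(n->next, 0, 1); /* read hint; harmless on NULL */
    }
    return NULL;
}

/* Main thread: sums all inner values while the helper runs concurrently. */
long traverse(const Outer *head, int stride) {
    HelperArgs args = { head, stride };
    pthread_t tid;
    int started = pthread_create(&tid, NULL, helper, &args) == 0;
    long sum = 0;
    for (const Outer *o = head; o != NULL; o = o->next)
        for (const Inner *n = o->items; n != NULL; n = n->next)
            sum += n->value;
    if (started)
        pthread_join(tid, NULL);
    return sum;
}
```

Calling `traverse(list, 1)` corresponds to the greedy helper that touches every delinquent load, while `traverse(list, 2)` lets the helper skip every other inner list. Both return the same sum, because the helper only reads the structure; any speedup depends on the cache hierarchy and on how far ahead the helper runs, which this sketch does not control. With GCC or Clang, compile with `-pthread`.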

Bibliographic details

  • Source
    International Journal of Parallel Programming | 2012, Issue 2 | pp. 141-163 | 23 pages
  • Author affiliations

    School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China, Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou, Henan, China;

    School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

    School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

    School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

    School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

    School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Keywords

    chip multiprocessor (CMP); prefetching thread; delinquent load; performance analysis; hotspot profiling;

