Conference: IEEE International Symposium on Parallel and Distributed Processing with Applications

L2 Cache Performance Analysis and Optimizations for Processing HDF5 Data on Multi-core Nodes


Abstract

Scientific middleware libraries must be designed and developed to harness the opportunities presented by the emerging multi-core processors available in grid and cloud environments. Libraries that do not adhere or adapt to this programming paradigm can suffer severe performance limitations when executing on such processors. In this paper, we focus on the utilization of a critical shared resource on chip multiprocessors (CMPs): the L2 cache. The way in which an application schedules and assigns processing work to each thread determines the access pattern of the shared L2 cache, which can either amplify or reduce the effects of memory latency on a multi-core processor. Therefore, when processing scientific data in formats such as HDF5, it is essential to conduct fine-grained analysis of cache utilization in order to make informed processing and scheduling decisions in multi-threaded programs. Using the TAU toolkit for performance feedback from dual- and quad-core machines, we analyze and recommend methods for effectively scheduling threads on multi-core nodes to improve the performance of scientific applications that process HDF5 data. We discuss the benefits of using L2 Cache-Affinity and L2 Balanced-Set based scheduling algorithms to improve L2 cache performance and, in turn, overall execution time.
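The abstract does not spell out how the scheduling algorithms work, so the following is only an illustrative sketch of the general cache-affinity idea, not the paper's method. It assumes each task reads a single data chunk and that cores are grouped into "cache domains" that share one L2 cache; the function name `assign_by_chunk_affinity` and the task representation are hypothetical.

```python
from collections import defaultdict


def assign_by_chunk_affinity(tasks, num_cache_domains):
    """Group tasks by the chunk they read, then spread the groups over cache domains.

    tasks: list of (task_id, chunk_id) pairs (hypothetical representation).
    num_cache_domains: number of core groups that each share one L2 cache (assumed).
    Returns a dict mapping domain index -> list of task ids.
    """
    # Tasks that touch the same chunk are kept together so they can reuse
    # data already resident in the same shared L2 cache.
    by_chunk = defaultdict(list)
    for task_id, chunk_id in tasks:
        by_chunk[chunk_id].append(task_id)

    domains = {d: [] for d in range(num_cache_domains)}

    # Place the largest groups first, each on the currently least-loaded
    # domain, so that no single L2 cache receives most of the work.
    ordered = sorted(by_chunk.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for _chunk_id, group in ordered:
        target = min(domains, key=lambda d: (len(domains[d]), d))
        domains[target].extend(group)
    return domains
```

In a real program, each domain's task list would then be run by threads pinned to the cores of that domain (on Linux, for example, via `os.sched_setaffinity`); the actual core-to-cache topology is machine-specific and is not modeled here.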
