Trace profiling: Scalable event tracing on high-end parallel systems

Abstract

Accurate performance analysis of high-end systems requires event-based traces to correctly identify the root causes of many of the complex performance problems that arise on these highly parallel systems. These high-end architectures contain tens to hundreds of thousands of processors, pushing application scalability challenges to new heights. Unfortunately, collecting event-based data presents scalability challenges of its own: the large volume of collected data increases tool overhead and produces data files that are difficult to store and analyze. Our solution to these problems is a new measurement technique called trace profiling, which collects the information needed to diagnose performance problems that traditionally require traces, but at a greatly reduced data volume. The technique reduces the amount of data stored by capitalizing on the repeated behavior of programs and on the similarity of the behavior and performance of parallel processes in an application run. Trace profiling is a hybrid between profiling and tracing: it collects summary information about the event patterns in an application run. Because the data has already been classified into behavior categories, we can present reduced, partially analyzed performance data to the user, highlighting the performance behaviors that account for most of the execution time.
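
The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea it describes: repeated event patterns are collapsed into per-pattern summaries instead of being stored event by event. The segment representation, the function names `trace_profile` and `rank_by_time`, and the choice of "identical sequence of event names" as the behavior category are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def trace_profile(segments):
    """Summarize trace segments by their event pattern (hypothetical helper).

    `segments` is an iterable of lists of (event_name, duration) tuples,
    for example one list per loop iteration. Segments with the same
    sequence of event names are treated as one behavior category, and
    only a count and a total time are kept per category.
    """
    summary = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    for segment in segments:
        # The pattern keeps event order (as a trace does) but drops
        # the individual timestamps (as a profile does).
        pattern = tuple(name for name, _ in segment)
        entry = summary[pattern]
        entry["count"] += 1
        entry["total_time"] += sum(duration for _, duration in segment)
    return dict(summary)


def rank_by_time(summary):
    """Order behavior categories by total time, largest first."""
    return sorted(summary.items(),
                  key=lambda item: item[1]["total_time"],
                  reverse=True)


if __name__ == "__main__":
    # Invented sample data: three iterations, two of which repeat a pattern.
    segments = [
        [("compute", 1.0), ("send", 0.5)],
        [("compute", 1.5), ("send", 0.5)],
        [("compute", 1.0), ("barrier", 4.0)],
    ]
    for pattern, stats in rank_by_time(trace_profile(segments)):
        print(pattern, stats)
```

In this sketch, storage grows with the number of distinct patterns rather than with the number of events, which is the kind of data reduction the abstract attributes to repeated program behavior.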

Bibliographic details

  • Source
    Parallel Computing | 2012, Issue 5 | pp. 194-225 | 32 pages
  • Author affiliations

    Portland State University, Computer Science Department, P.O. Box 751, Portland, OR 97207-0751, United States; Lawrence Livermore National Laboratory, Box 808, L-557, Livermore, CA 94551-0808, United States;

    Portland State University, Computer Science Department, P.O. Box 751, Portland, OR 97207-0751, United States;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Keywords

    performance measurement; event tracing; parallel performance tools;