Conference: 24th ACM international conference on supercomputing 2010

Clustering Performance Data Efficiently at Massive Scales

Abstract

Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune their applications and to exploit these systems fully. However, extreme scales pose unique challenges for performance-tuning tools, which can generate significant volumes of I/O. Compute-to-I/O ratios have increased drastically as systems have grown, and the I/O systems of large machines can handle the peak load from only a small fraction of cores. Tool developers need efficient techniques to analyze and to reduce performance data from large numbers of cores.

We introduce CAPEK, a novel parallel clustering algorithm that enables in-situ analysis of performance data at run time. Our algorithm scales sub-linearly to 131,072 processes, running in less than one second even at that scale, which is fast enough for on-line use in production runs. The CAPEK implementation is fully generic and can be used for many types of analysis. We demonstrate its application to statistical trace sampling. Specifically, we use our algorithm to compute efficiently stratified sampling strategies for traces at run time. We show that such stratification can result in data-volume reduction of up to four orders of magnitude on current large-scale systems, with potential for greater reductions for future extreme-scale systems.
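The idea of stratified trace sampling can be illustrated with a minimal serial sketch. It is not CAPEK and not the authors' implementation: it assumes the clustering has already been done and that each process rank carries a cluster label. The function name `stratified_sample`, the `labels` input, and the proportional-allocation rule (each cluster contributes roughly the same fraction of its members, and at least one) are all assumptions made for illustration. The abstract does not state which allocation rule the paper uses.

```python
import random
from collections import defaultdict

def stratified_sample(labels, fraction, seed=0):
    """Choose process ranks to trace, one stratum (cluster) at a time.

    labels[i] is the cluster id of process rank i (hypothetical input,
    assumed to come from a prior clustering step).
    Each cluster contributes about `fraction` of its members, and at
    least one, so that small clusters are never dropped entirely.
    """
    rng = random.Random(seed)

    # Group ranks by cluster label to form the strata.
    strata = defaultdict(list)
    for rank, label in enumerate(labels):
        strata[label].append(rank)

    chosen = []
    for label in sorted(strata):
        members = strata[label]
        # Proportional allocation (an assumption of this sketch).
        k = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)

# Toy example: 100 processes in three behaviour classes of very
# different sizes (made-up data, not results from the paper).
labels = [0] * 90 + [1] * 8 + [2] * 2
sample = stratified_sample(labels, fraction=0.1)
print(len(sample), "of", len(labels), "ranks selected for tracing")
```

Only the selected ranks would write full traces, which is where the data-volume reduction comes from. Sampling within each cluster keeps rare behaviour classes represented, whereas a uniform random sample of the same size could miss them.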
