Conference: 2017 IEEE International Symposium on Performance Analysis of Systems and Software

Analyzing the scalability of managed language applications with speedup stacks

Abstract

Understanding why multi-threaded applications do not scale perfectly on modern multicore hardware is challenging. Furthermore, more and more modern programs are written in managed languages, whose extra service threads (e.g., for memory management) may hinder scalability and complicate performance analysis. In this paper, we extend speedup stacks, a previously presented visualization tool for analyzing multi-threaded program scalability, to managed applications. Speedup stacks are comprehensive bar graphs that break down an application's execution to explain the main causes of sublinear speedup, i.e., when some threads prevent the application from making progress and thus increase the execution time. We not only expand speedup stacks to analyze how the managed language's service threads affect overall scalability, but also implement speedup stacks while running on native hardware. We monitor the scheduling behavior of application and service threads using lightweight OS kernel modules, incurring under 1% overhead when running unmodified Java benchmarks. We add two performance delimiters targeting managed applications: garbage collection and main initialization activities. Using speedup stacks, we analyze the scalability limitations of these benchmarks and the impact of both a stop-the-world and a concurrent garbage collector. Our visualization tool facilitates the identification of scalability bottlenecks both among application threads and in service threads, showing developers whether optimization should focus on the language runtime or on the application. Speedup stacks give both program and system designers a better understanding of program behavior, which can help optimize multicore processor performance.
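
To make the idea of a speedup stack concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified accounting in which each thread's run time is split into useful work plus named overhead categories, so that the bar for an N-thread run has total height N: the achieved speedup plus one segment per category of lost time. The function name `speedup_stack`, the category names, and all timings are hypothetical.

```python
from typing import Dict, List


def speedup_stack(parallel_time: float,
                  overheads: List[Dict[str, float]]) -> Dict[str, float]:
    """Return bar segments whose heights sum to the thread count.

    parallel_time: wall-clock time of the multi-threaded run.
    overheads: one dict per thread, mapping a category name to the time
               that thread spent without making useful progress.
    This is an illustrative model, not the accounting used in the paper.
    """
    n_threads = len(overheads)
    stack: Dict[str, float] = {}
    for per_thread in overheads:
        for category, lost in per_thread.items():
            # Each category's segment is its total lost time, summed over
            # threads and normalized by the parallel execution time.
            stack[category] = stack.get(category, 0.0) + lost / parallel_time
    # Whatever is not lost to a category is the speedup actually achieved.
    stack["achieved speedup"] = n_threads - sum(stack.values())
    return stack


if __name__ == "__main__":
    # Hypothetical 4-thread run lasting 10 time units (made-up numbers).
    example = [
        {"garbage collection": 1.0, "initialization": 0.5},
        {"garbage collection": 1.0, "initialization": 0.5},
        {"garbage collection": 1.0, "synchronization": 2.0},
        {"garbage collection": 1.0, "synchronization": 1.0},
    ]
    for name, height in speedup_stack(10.0, example).items():
        print(f"{name:>20}: {height:.2f}")
```

In this model a tall "garbage collection" segment would point toward the language runtime, while a tall "synchronization" segment would point toward the application itself, which is the kind of distinction the abstract describes.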
