Conference: IEEE/ACM International Symposium on Code Generation and Optimization

Optimizing function placement for large-scale data-center applications

Abstract

Modern data-center applications often comprise a large amount of code, with substantial working sets, making them good candidates for code-layout optimizations. Although recent work has evaluated the impact of profile-guided intramodule optimizations and some cross-module optimizations, no recent study has evaluated the benefit of function placement for such large-scale applications. In this paper, we study the impact of function placement in the context of a simple tool we created that uses sample-based profiling data. By using sample-based profiling, this methodology follows the same principle behind AutoFDO, i.e. using profiling data collected from unmodified binaries running in production, which makes it applicable to large-scale binaries. Using this tool, we first evaluate the impact of the traditional Pettis-Hansen (PH) function-placement algorithm on a set of widely deployed data-center applications. Our experiments show that using the PH algorithm improves the performance of the studied applications by an average of 2.6%. In addition to that, this paper also evaluates the impact of two improvements on top of the PH technique. The first improvement is a new algorithm, called C3, which addresses a fundamental weakness we identified in the PH algorithm. We not only qualitatively illustrate how C3 overcomes this weakness in PH, but also present experimental results confirming that C3 performs better than PH in practice, boosting the performance of our workloads by an average of 2.9% on top of PH. The second improvement we evaluate is the selective use of huge pages. Our evaluation shows that, although aggressively mapping the entire code section of a large binary onto huge pages can be detrimental to performance, judiciously using huge pages can further improve performance of our applications by 2.0% on average.
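The abstract names C3 but does not describe how it works, so the following is only a rough sketch of profile-driven caller/callee clustering, the general idea behind call-graph-based function placement. It should not be read as the paper's algorithm. The function name `cluster_functions`, the input dictionaries, the 4096-byte cluster limit, and the density-based final ordering are all assumptions made for illustration.

```python
def cluster_functions(sizes, samples, calls, max_cluster_bytes=4096):
    """Greedy caller/callee clustering (illustrative sketch, not the paper's code).

    sizes:   {function: code size in bytes}
    samples: {function: profile samples attributed to the function}
    calls:   {(caller, callee): sampled call count}
    Returns the functions in a proposed placement order.
    """
    # Every function starts in its own cluster; merged functions share one list.
    cluster_of = {f: [f] for f in sizes}

    # Heaviest incoming call arc per callee; self-recursion is ignored.
    best_caller = {}
    for (caller, callee), weight in calls.items():
        if caller == callee:
            continue
        if callee not in best_caller or weight > best_caller[callee][1]:
            best_caller[callee] = (caller, weight)

    # Visit functions from hottest to coldest (assumed heuristic).
    for func in sorted(sizes, key=lambda f: samples.get(f, 0), reverse=True):
        if func not in best_caller:
            continue
        src = cluster_of[func]
        dst = cluster_of[best_caller[func][0]]
        if src is dst:
            continue
        merged = sum(sizes[f] for f in src) + sum(sizes[f] for f in dst)
        if merged > max_cluster_bytes:
            continue  # keep clusters small enough to fit the assumed limit
        dst.extend(src)  # callee's cluster is placed right after the caller's
        for f in src:
            cluster_of[f] = dst

    # Collect each distinct cluster once, then order by samples per byte.
    seen, clusters = set(), []
    for f in sizes:
        c = cluster_of[f]
        if id(c) not in seen:
            seen.add(id(c))
            clusters.append(c)

    def density(c):
        return sum(samples.get(f, 0) for f in c) / max(1, sum(sizes[f] for f in c))

    clusters.sort(key=density, reverse=True)
    return [f for c in clusters for f in c]


# Hypothetical profile data, invented for this example.
sizes = {"main": 100, "parse": 200, "eval": 300, "log": 50}
samples = {"main": 10, "parse": 80, "eval": 60, "log": 1}
calls = {("main", "parse"): 90, ("main", "eval"): 5,
         ("parse", "eval"): 70, ("eval", "log"): 1}
order = cluster_functions(sizes, samples, calls)
print(order)
```

In a real toolchain, an ordering like this would be handed to the linker to lay out the code section. Placing a callee directly after its most frequent caller is meant to keep hot call chains on the same page, which is the kind of locality that function placement tries to improve.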