Empirical Software Engineering

Software microbenchmarking in the cloud. How bad is it really?

Abstract

Rigorous performance engineering traditionally assumes measuring on bare-metal environments to control for as many confounding factors as possible. Unfortunately, some researchers and practitioners might not have the access, knowledge, or funds to operate dedicated performance-testing hardware, making public clouds an attractive alternative. However, shared public cloud environments are inherently unpredictable in terms of the system performance they provide. In this study, we explore the effects of cloud environments on the variability of performance test results and to what extent slowdowns can still be reliably detected even in a public cloud. We focus on software microbenchmarks as an example of performance tests and execute extensive experiments on three different well-known public cloud services (AWS, GCE, and Azure), using three different cloud instance types per service. We also compare the results to a hosted bare-metal offering from IBM Bluemix. In total, we gathered more than 4.5 million unique microbenchmarking data points from benchmarks written in Java and Go. We find that the variability of results differs substantially between benchmarks and instance types (by a coefficient of variation from 0.03% to >100%). However, executing test and control experiments on the same instances (in randomized order) allows us to detect slowdowns of 10% or less with high confidence, using state-of-the-art statistical tests (i.e., Wilcoxon rank-sum and overlapping bootstrapped confidence intervals). Finally, our results indicate that Wilcoxon rank-sum manages to detect smaller slowdowns in cloud environments.
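The detection approach the abstract describes (test and control measurements on the same instance, compared via a Wilcoxon rank-sum test and via overlapping bootstrapped confidence intervals) can be sketched as follows. This is a minimal illustration on synthetic timing data, not the authors' actual harness; the sample sizes, significance level, and injected 10% slowdown are assumptions for demonstration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic microbenchmark timings (ms): a control run and a test run
# with an injected 10% slowdown, as in the paper's detection scenario.
control = rng.normal(loc=100.0, scale=5.0, size=200)
test = rng.normal(loc=110.0, scale=5.0, size=200)

def cv(x):
    """Coefficient of variation, the variability measure the study reports."""
    return x.std(ddof=1) / x.mean()

# 1) Wilcoxon rank-sum test: is the test distribution shifted upward?
_, pvalue = stats.ranksums(control, test)

# 2) Bootstrapped 95% confidence intervals of the mean; a slowdown is
#    flagged when the two intervals do not overlap.
def bootstrap_ci(x, n_boot=1000, alpha=0.05):
    means = [rng.choice(x, size=len(x), replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

lo_c, hi_c = bootstrap_ci(control)
lo_t, hi_t = bootstrap_ci(test)
overlap = not (hi_c < lo_t or hi_t < lo_c)

print(f"CV(control) = {cv(control):.3%}")
print(f"rank-sum p-value = {pvalue:.2e} -> slowdown detected: {pvalue < 0.01}")
print(f"CIs overlap = {overlap} -> slowdown detected: {not overlap}")
```

With a clear 10% shift both criteria agree; on noisy cloud data the paper's point is that randomizing the execution order of test and control on the same instance is what makes these comparisons trustworthy.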
