Empirical Software Engineering

Software microbenchmarking in the cloud. How bad is it really?

Abstract

Rigorous performance engineering traditionally assumes measuring on bare-metal environments to control for as many confounding factors as possible. Unfortunately, some researchers and practitioners might not have the access, knowledge, or funds to operate dedicated performance-testing hardware, making public clouds an attractive alternative. However, shared public cloud environments are inherently unpredictable in terms of the system performance they provide. In this study, we explore the effects of cloud environments on the variability of performance test results and to what extent slowdowns can still be reliably detected even in a public cloud. We focus on software microbenchmarks as an example of performance tests and execute extensive experiments on three different well-known public cloud services (AWS, GCE, and Azure), using three different cloud instance types per service. We also compare the results to a hosted bare-metal offering from IBM Bluemix. In total, we gathered more than 4.5 million unique microbenchmarking data points from benchmarks written in Java and Go. We find that the variability of results differs substantially between benchmarks and instance types (by a coefficient of variation from 0.03% to >100%). However, executing test and control experiments on the same instances (in randomized order) allows us to detect slowdowns of 10% or less with high confidence, using state-of-the-art statistical tests (i.e., Wilcoxon rank-sum and overlapping bootstrapped confidence intervals). Finally, our results indicate that Wilcoxon rank-sum manages to detect smaller slowdowns in cloud environments.
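The detection approach the abstract describes (test and control measurements on the same instance, compared via a Wilcoxon rank-sum test and via overlapping bootstrapped confidence intervals) can be sketched as follows. This is a minimal illustration on synthetic timing data, not the authors' actual harness; the sample sizes, significance level, and injected 10% slowdown are assumptions for demonstration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic microbenchmark timings (ms): a control run and a test run
# with an injected 10% slowdown, as in the paper's detection scenario.
control = rng.normal(loc=100.0, scale=5.0, size=200)
test = rng.normal(loc=110.0, scale=5.0, size=200)

def cv(x):
    """Coefficient of variation, the variability measure the study reports."""
    return x.std(ddof=1) / x.mean()

# 1) Wilcoxon rank-sum test: is the test distribution shifted upward?
_, pvalue = stats.ranksums(control, test)

# 2) Bootstrapped 95% confidence intervals of the mean; a slowdown is
#    flagged when the two intervals do not overlap.
def bootstrap_ci(x, n_boot=1000, alpha=0.05):
    means = [rng.choice(x, size=len(x), replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

lo_c, hi_c = bootstrap_ci(control)
lo_t, hi_t = bootstrap_ci(test)
overlap = not (hi_c < lo_t or hi_t < lo_c)

print(f"CV(control) = {cv(control):.3%}")
print(f"rank-sum p-value = {pvalue:.2e} -> slowdown detected: {pvalue < 0.01}")
print(f"CIs overlap = {overlap} -> slowdown detected: {not overlap}")
```

With a clear 10% shift both criteria agree; on noisy cloud data the paper's point is that randomizing the execution order of test and control on the same instance is what makes these comparisons trustworthy.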
