Journal: Performance Evaluation

Analyzing the distribution fit for storage workload and Internet traffic traces



Abstract

Understanding workloads and modeling their performance is important for optimizing systems and services. A useful first step towards understanding the characteristics of workloads is to analyze their inter-arrival times and service requirements. If these characteristics are found to follow certain probability distributions, then corresponding stochastic models can be employed to efficiently estimate the performance of workloads. Such approaches have been explored in specific domains using an assortment of distributions, including the Normal, Weibull, and Exponential. Our primary goal in this work is to understand and model storage workload performance. However, our analysis and others' past attempts revealed that none of the commonly employed distributions provided a good fit for storage workloads. We analyzed over 250 traces across 5 different workload families using 20 widely used distributions, including ones seldom used for storage modeling. We found that the Hyper-exponential distribution with just two phases (H-2) was superior in modeling the storage traces compared to other distributions under five diverse metrics of accuracy, including metrics that assess the risk of over-fitting. Based on these results, we developed a Markov-chain-based stochastic model that accurately estimates the storage system performance across several workload traces. To assess the applicability of the Hyper-exponential for distribution fitting beyond storage traces, we evaluated distribution fitting for Internet traffic traces using over 1,600 traces from 3 different sources. We again found that the Hyper-exponential distribution provided a superior fit compared to other probability distributions. To highlight the applicability of our model, we conducted what-if analyses to investigate (i) the storage performance impact of workload variability and garbage collection under various scenarios and (ii) the impact on service response time of Internet flash crowds.
(C) 2020 Elsevier B.V. All rights reserved.
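
As a rough illustration of the kind of fit the abstract describes, the sketch below draws synthetic inter-arrival times from a two-phase Hyper-exponential distribution (the H-2 of the abstract), with density f(x) = p*l1*exp(-l1*x) + (1-p)*l2*exp(-l2*x), and fits it with a plain expectation-maximization (EM) loop. The abstract does not say which fitting procedure or accuracy metrics the authors used, so the EM routine, the function names, the synthetic parameters, and the log-likelihood comparison against a single Exponential are all assumptions made for this example, not the paper's method.

```python
import numpy as np


def sample_h2(rng, n, p, lam1, lam2):
    """Draw n samples from a two-phase hyper-exponential (illustrative helper)."""
    # Each sample picks phase 1 with probability p, otherwise phase 2.
    in_phase1 = rng.random(n) < p
    scale = np.where(in_phase1, 1.0 / lam1, 1.0 / lam2)
    return rng.exponential(scale)


def h2_loglik(x, p, lam1, lam2):
    """Log-likelihood of the data under the two-phase mixture density."""
    dens = p * lam1 * np.exp(-lam1 * x) + (1 - p) * lam2 * np.exp(-lam2 * x)
    return float(np.log(dens).sum())


def exp_loglik(x):
    """Log-likelihood of a single Exponential at its MLE rate 1/mean(x)."""
    n, m = len(x), x.mean()
    return float(-n * np.log(m) - n)


def fit_h2_em(x, iters=200):
    """Fit (p, lam1, lam2) by EM; the starting point is an arbitrary assumption."""
    x = np.asarray(x, dtype=float)
    m = x.mean()
    p, lam1, lam2 = 0.5, 2.0 / m, 0.5 / m
    for _ in range(iters):
        # E-step: probability that each sample came from phase 1.
        a = p * lam1 * np.exp(-lam1 * x)
        b = (1 - p) * lam2 * np.exp(-lam2 * x)
        r = a / (a + b)
        # M-step: weighted maximum-likelihood updates.
        p = r.mean()
        lam1 = r.sum() / (r * x).sum()
        lam2 = (1 - r).sum() / ((1 - r) * x).sum()
    return p, lam1, lam2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a trace: a mix of short and long gaps.
    gaps = sample_h2(rng, 5000, p=0.3, lam1=10.0, lam2=0.5)
    p, lam1, lam2 = fit_h2_em(gaps)
    print("fitted p, lam1, lam2:", p, lam1, lam2)
    print("H2 log-likelihood:   ", h2_loglik(gaps, p, lam1, lam2))
    print("Exp log-likelihood:  ", exp_loglik(gaps))
```

A two-phase Hyper-exponential with distinct rates has a coefficient of variation above 1, whereas a single Exponential is fixed at exactly 1, which is one plausible reason a mixture of this form can track bursty inter-arrival times more closely. Comparing raw log-likelihoods favors the model with more parameters, so a real comparison would also need measures that penalize over-fitting, as the abstract notes.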
