Venue: IEEE International Conference on Parallel and Distributed Systems

How Does the Workload Look Like in Production Cloud? Analysis and Clustering of Workloads on Alibaba Cluster Trace


Abstract

Cloud computing is widely used in today's datacenters because of benefits such as high scalability, on-demand services, and low cost. An in-depth understanding of the characteristics of workloads running in production cloud environments is essential for improving resource management efficiency. In this paper, we use visualization techniques and clustering methods to analyze in detail the trace dataset released by Alibaba, which contains 11089 online services and 12951 batch jobs running on 1313 machines. Our methodology for clustering workloads has four steps: i) select effective feature vectors as the dimensions of clustering; ii) identify the cluster boundaries of each dimension using the K-Means algorithm; iii) classify jobs by combining the feature vectors, using the boundaries from the previous step; iv) analyze the characteristics of the workload groups at runtime. Our analysis reveals several insights that previous work on the Alibaba cluster trace has not reported. For batch jobs: a) the average CPU cores of all batch jobs clearly follow a bimodal distribution; b) at a randomly sampled time, more than 50% of machines run only one group of jobs, characterized by short duration, medium CPU cores, and small memory utilization, while the remaining machines run mixed groups of jobs. For online instances: a) the resource usage (CPU, memory, and disk) of most online instances is low; b) at a randomly sampled time, up to six groups, as defined by our clustering method, run on the same machine.
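Steps i) to iii) of the methodology can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the feature set, the number of clusters per dimension, or how boundaries are derived from the K-Means result. The sketch assumes three features (`duration`, `cpu`, `mem`), two clusters per dimension, and boundaries placed at the midpoints between adjacent centroids. All function names and the sample jobs are hypothetical.

```python
from collections import Counter


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns the k centroids in ascending order."""
    xs = sorted(values)
    n = len(xs)
    # Deterministic initialisation (an assumption): evenly spaced quantiles.
    centroids = [xs[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centroids[c]))
            buckets[nearest].append(x)
        # Keep the old centroid if a bucket ends up empty.
        updated = [sum(b) / len(b) if b else centroids[i]
                   for i, b in enumerate(buckets)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def boundaries(centroids):
    """Assumed boundary rule: midpoints between adjacent centroids."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


def level(x, bounds):
    """Index of the interval that x falls into (0 = lowest)."""
    return sum(x > b for b in bounds)


def classify(jobs, features, k):
    """Step ii: per-dimension boundaries. Step iii: combine levels into a group."""
    bounds = {f: boundaries(kmeans_1d([j[f] for j in jobs], k)) for f in features}
    groups = [tuple(level(j[f], bounds[f]) for f in features) for j in jobs]
    return bounds, groups


# Hypothetical jobs; the values are invented for illustration only.
jobs = [
    {"duration": 10, "cpu": 0.5, "mem": 0.1},
    {"duration": 12, "cpu": 0.6, "mem": 0.2},
    {"duration": 11, "cpu": 4.0, "mem": 0.15},
    {"duration": 300, "cpu": 4.2, "mem": 0.8},
    {"duration": 320, "cpu": 0.55, "mem": 0.9},
    {"duration": 310, "cpu": 4.1, "mem": 0.85},
]
bounds, groups = classify(jobs, ["duration", "cpu", "mem"], k=2)
sizes = Counter(groups)  # number of jobs per combined group
```

Each job ends up with a tuple such as `(0, 1, 0)`, read here as short duration, high CPU, and small memory. Clustering each dimension separately keeps the groups interpretable, at the cost of ignoring correlations between features.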
