Published in: Expert Systems with Applications

An unsupervised approach to online noisy-neighbor detection in cloud data centers

Abstract

Resource sharing is an inherent characteristic of cloud data centers. Virtual machines (VMs) and/or containers that are co-located on the same physical server often compete for resources, which leads to interference. The noisy-neighbor effect refers to an anomaly caused by one VM or container limiting the resources available to another. Our main contribution is an online, lightweight, and application-agnostic solution for anomaly detection that follows an unsupervised approach. It is based on comparing models for different lags: Dirichlet Process Gaussian Mixture Models characterize the resource usage profile of the application, and distance measures score the similarity among models. An alarm is raised when there is an abrupt change at the short-term lag (i.e., a high distance score for short-term models) while the long-term state remains constant. We test the algorithm on different cloud workloads: websites, periodic batch applications, Spark-based applications, and a Memcached server. We detect anomalies in CPU and memory resource usage with up to 82-96% recall, depending on the scenario. Compared to other baseline methods, our approach detects anomalies successfully while raising a low number of false positives, even for applications with unusual normal behavior (e.g., periodic). Experiments show that our proposed algorithm is a lightweight and effective solution for detecting the noisy-neighbor effect without any historical information about the application, and that it could potentially be applied to other kinds of anomalies. © 2017 Elsevier Ltd. All rights reserved.
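
The alarm rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions that are not in the abstract: a single Gaussian per window stands in for the Dirichlet Process Gaussian Mixture Model, the 2-Wasserstein distance stands in for the unspecified distance measures, models are compared across adjacent windows, and the function names, window sizes, and thresholds are hypothetical.

```python
import math


def fit_gaussian(window):
    """Summarize a window of usage samples as (mean, standard deviation).

    Assumption: a single Gaussian stands in for the DPGMM named in the abstract.
    """
    mean = sum(window) / len(window)
    variance = sum((x - mean) ** 2 for x in window) / len(window)
    return mean, math.sqrt(variance)


def wasserstein2(p, q):
    """2-Wasserstein distance between two univariate Gaussians (mean, std)."""
    (mean_p, std_p), (mean_q, std_q) = p, q
    return math.hypot(mean_p - mean_q, std_p - std_q)


def lag_distance(series, t, window):
    """Distance between the models of the two adjacent windows ending at index t."""
    recent = fit_gaussian(series[t - window:t])
    previous = fit_gaussian(series[t - 2 * window:t - window])
    return wasserstein2(recent, previous)


def detect(series, short_win=8, long_win=80, high=9.0, low=7.0):
    """Return indices t where short-lag models diverge but long-lag models do not.

    Window sizes and thresholds are hypothetical values chosen for this demo.
    """
    alarms = []
    for t in range(2 * long_win, len(series) + 1):
        d_short = lag_distance(series, t, short_win)
        d_long = lag_distance(series, t, long_win)
        if d_short > high and d_long < low:
            alarms.append(t)
    return alarms


# Synthetic CPU usage (%): a steady periodic pattern, then a 4-sample burst.
clean = [[28.0, 30.0, 32.0, 30.0][i % 4] for i in range(400)]
noisy = list(clean)
noisy[200:204] = [60.0] * 4

print(detect(clean))  # no alarms on the undisturbed series
print(detect(noisy))  # alarms cluster around the burst at indices 200-203
```

In this sketch the periodic baseline raises no alarm because adjacent windows yield the same model at both lags. A closer approximation of the paper's models could fit a Dirichlet-process mixture per window instead, for example with scikit-learn's `BayesianGaussianMixture` and `weight_concentration_prior_type="dirichlet_process"`, at the cost of an external dependency.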