Published in: IEEE International Conference on Big Data

Dependency analysis of cloud applications for performance monitoring using recurrent neural networks



Abstract

Performance monitoring of cloud-native applications that consist of several microservices involves the analysis of time series data collected from the infrastructure, platform, and application layers of the cloud software stack. Analyzing the runtime dependencies amongst the component microservices is an essential step towards performing cloud resource management, detecting anomalous behavior of cloud applications, and meeting customer Service Level Agreements (SLAs). Finding such dependencies is challenging because of the non-linear nature of the interactions, aberrant data measurements, and a lack of domain knowledge. In this paper, we propose a novel use of the modeling capability of Long Short-Term Memory (LSTM) recurrent neural networks, which excel at capturing temporal relationships in multivariate time series data and are resilient to noisy pattern representations. Our proposed technique inspects the structure of the trained LSTM model to uncover the dependencies amongst performance metrics that were learned during training. We further apply this technique in three monitoring use cases: finding the strongest performance predictors, discovering lagged/temporal dependencies, and improving the accuracy of forecasting for a given metric. We demonstrate the viability of our approach by comparing its results in the three use cases with those obtained from previously proposed methods, such as Granger causality, and from classical statistical time series analysis models, such as ARIMA and Holt-Winters. For our experiments and analysis, we use performance monitoring data collected from two sources: a controlled experiment involving a sample cloud application that we deployed in a public cloud infrastructure, and cloud monitoring data collected from the monitoring service of an operational public cloud service provider.
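The abstract names Granger causality as one of the baselines for discovering lagged dependencies. The following is a minimal sketch of that baseline idea only, not the LSTM-based technique proposed in the paper. It assumes two synthetic metrics (the names `cpu` and `latency`, the lag of 2 steps, and the helper functions are all hypothetical illustrations) and computes the usual F-statistic that compares a model using only the target's own lags with one that also uses lags of the candidate predictor.

```python
import numpy as np


def lag_matrix(series, max_lag):
    """Columns are lags 1..max_lag of `series`, row-aligned with series[max_lag:]."""
    n = len(series)
    return np.column_stack(
        [series[max_lag - k : n - k] for k in range(1, max_lag + 1)]
    )


def granger_f_stat(x, y, max_lag):
    """F-statistic for the hypothesis that lags of x help predict y
    beyond what y's own lags already explain (Granger-style test)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]
    intercept = np.ones((len(target), 1))
    # Restricted model: y explained by its own lags only.
    restricted = np.hstack([intercept, lag_matrix(y, max_lag)])
    # Unrestricted model: additionally uses lags of x.
    unrestricted = np.hstack([restricted, lag_matrix(x, max_lag)])

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / max_lag) / (rss_u / dof)


# Synthetic example (assumption): latency follows cpu with a lag of 2 steps.
rng = np.random.default_rng(0)
n = 500
cpu = rng.normal(size=n)
latency = np.zeros(n)
for t in range(2, n):
    latency[t] = 0.8 * cpu[t - 2] + 0.1 * rng.normal()

f_forward = granger_f_stat(cpu, latency, max_lag=3)  # cpu -> latency
f_reverse = granger_f_stat(latency, cpu, max_lag=3)  # latency -> cpu
print(f"cpu -> latency F = {f_forward:.1f}")
print(f"latency -> cpu F = {f_reverse:.1f}")
```

In this toy setup the forward statistic should be far larger than the reverse one, which is the asymmetry a Granger-style test looks for. The test is linear by construction, so it may miss the non-linear interactions that the abstract cites as a motivation for using LSTM models instead.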
