International Conference on Knowledge Engineering and Knowledge Management

Combining Machine Learning and Semantics for Anomaly Detection

Abstract

The emergence of the Internet of Things and stream processing forces large-scale organizations to consider anomaly detection as a key component of their business. Using machine learning to solve such complex use cases is generally a cumbersome, costly, time-consuming, and error-prone process. It involves many tasks, from data cleansing to dimensionality reduction, algorithm selection, and fine-tuning. It also requires the involvement of various experts such as statisticians, programmers, and testers. With RAMSSES, we remove the burden of this pipeline and demonstrate that these tasks can be automated. Our system leverages a Lambda architecture based on Apache Spark to analyze historical data, perform cleansing, and deal with the curse of dimensionality. It then identifies the most interesting attributes and uses a continuous semantic query generator executed over streams. The sampled data are processed by self-selected machine learning methods to detect anomalies; an iterative process using end-user annotations significantly improves the accuracy of the system. After a description of RAMSSES's main components, the performance and relevance of the system are demonstrated via a thorough evaluation over real-world and synthetic datasets.
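
The flow outlined in the abstract (a batch pass over historical data, a streaming detection step, and a feedback loop driven by user annotations) can be illustrated with a small sketch. This is not RAMSSES's implementation: the abstract does not specify its algorithms, so the z-score detector, the threshold update rule, and every name below (`build_baseline`, `is_anomaly`, `adjust_threshold`, the `temp` and `load` attributes) are assumptions chosen only for illustration, using plain Python rather than Apache Spark.

```python
from statistics import mean, pstdev


def build_baseline(history):
    """Batch step (hypothetical): drop incomplete records, then compute
    the (mean, standard deviation) of each attribute."""
    clean = [r for r in history if all(v is not None for v in r.values())]
    baseline = {}
    for key in clean[0]:
        values = [r[key] for r in clean]
        baseline[key] = (mean(values), pstdev(values))
    return baseline


def is_anomaly(record, baseline, threshold):
    """Streaming step (hypothetical): flag a record if any attribute lies
    more than `threshold` standard deviations from its historical mean."""
    for key, (mu, sigma) in baseline.items():
        if sigma > 0 and abs(record[key] - mu) / sigma > threshold:
            return True
    return False


def adjust_threshold(threshold, flagged, user_says_anomaly, step=0.5):
    """Feedback step (hypothetical): nudge the threshold whenever the
    end user's annotation disagrees with the detector."""
    if flagged and not user_says_anomaly:
        return threshold + step  # false positive: become less sensitive
    if not flagged and user_says_anomaly:
        return max(step, threshold - step)  # missed anomaly: become more sensitive
    return threshold


# Made-up historical data; the record with a missing value is dropped.
history = [
    {"temp": 20.0, "load": 0.50},
    {"temp": 21.0, "load": 0.55},
    {"temp": None, "load": 0.52},
    {"temp": 19.0, "load": 0.45},
]
baseline = build_baseline(history)
threshold = 3.0

reading = {"temp": 35.0, "load": 0.50}  # a made-up incoming stream record
flagged = is_anomaly(reading, baseline, threshold)

# If the user marks this alert as a false positive, the threshold rises.
threshold = adjust_threshold(threshold, flagged, user_says_anomaly=False)
```

In a real deployment the baseline would be recomputed periodically by the batch layer while the streaming layer keeps scoring new records against the latest baseline, which is the general idea behind a Lambda architecture.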
