Distributed Discord Discovery: Spark Based Anomaly Detection in Time Series

Abstract

The computational complexity of discord discovery is O(m^2), where m is the size of the time series. Many promising methods have been proposed to address this compute-intensive problem, but they discover discords sequentially on a standalone machine. The limited computing and memory capacity of a single machine prevents these methods from discovering discords in large datasets in reasonable time. In this work, we propose a distributed discord discovery method. Our method can combine discord results from different computing nodes, which were non-combinable in previous literature. We mitigate the memory-wall issue by using distributed data partitioning. We implement our method on the distributed Spark computing framework and the distributed HDFS (Hadoop Distributed File System) storage platform. The implementation exhibits superior scalability and enables discord discovery in multi-dimensional time series. We evaluate our method on a terabyte-sized dataset, which is larger than any dataset used in previous literature. Evaluation results show that our method has a clear advantage in performance and efficiency over state-of-the-art algorithms.
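
For orientation, the quadratic cost of discord discovery comes from the brute-force definition: every subsequence is compared with every other non-overlapping subsequence, and the discord is the one whose nearest neighbour is farthest away. The sketch below is a minimal single-machine illustration of that baseline, not the distributed method of the paper. The function names, the plain Euclidean distance (no z-normalisation), and the toy series are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_discord(series, n):
    """Return (start index, nearest-neighbour distance) of the top discord.

    series: list of numbers; n: subsequence length (hypothetical parameter names).
    The two nested loops over all subsequences give the quadratic cost.
    """
    num_subseq = len(series) - n + 1
    best_idx, best_dist = -1, -1.0
    for i in range(num_subseq):
        nearest = math.inf
        for j in range(num_subseq):
            if abs(i - j) < n:  # skip overlapping (trivial) matches
                continue
            d = euclidean(series[i:i + n], series[j:j + n])
            if d < nearest:
                nearest = d
        # Keep the subsequence whose nearest neighbour is farthest away.
        if nearest != math.inf and nearest > best_dist:
            best_idx, best_dist = i, nearest
    return best_idx, best_dist


# Toy series (assumed data): a regular 0/1 pattern with one spike at index 9.
series = [0, 1, 0, 1, 0, 1, 0, 1, 0, 5, 0, 1, 0, 1, 0, 1, 0, 1]
idx, dist = brute_force_discord(series, 2)
print(idx, dist)
```

On a single machine this loop becomes impractical as the series grows, which is the limitation the abstract attributes to standalone methods.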

Bibliographic details

Source
2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, 2015 IEEE 12th International Conference on Embedded Software and Systems | 2015 | pp. 154-159 | 6 pages
Conference location: New York, NY (US)
Authors
Yafei Wu; Yongxin Zhu; Tian Huang; Xinyang Li; Xinyi Liu; Mengyun Liu;
Author affiliation

School of Microelectronics, Shanghai Jiao Tong University, Shanghai, China

Original format: PDF
Language: English
Keywords
computational complexity; distributed processing; parallel processing; security of data; time series; O(m^2); anomaly detection; computational complexity; distributed Hadoop distributed file system storage platform; distributed data partitioning; distributed discord discovery method; distributed spark computing framework; distributed storage platform; memory capacity hinder; multidimension time series; Acceleration; Algorithm design and analysis; Clustering algorithms; Force; Microelectronics; Sparks; Time;
Date indexed: 2022-08-26 13:53:55

Similar literature

1. Time Series Motif Discovery and Anomaly Detection Based on Subseries Join [J]. Yi Lin, Michael D. McCool, Ali A. Ghorbani. IAENG International Journal of Computer Science, 2010, No. 3
2. A comparison of two blending-based ensemble techniques for network anomaly detection in Spark distributed environment [J]. Kaur Gagandeep, Jain Meenal. International Journal of Ad Hoc and Ubiquitous Computing, 2020, No. 2
3. Robust and Accurate Anomaly Detection in ECG Artifacts Using Time Series Motif Discovery [J]. Haemwaan Sivaraks, Chotirat Ann Ratanamahatana. Computational and Mathematical Methods in Medicine, 2015, No. 3a4
4. Distributed Discord Discovery: Spark Based Anomaly Detection in Time Series [C]. Yafei Wu, Yongxin Zhu, Tian Huang. IEEE International Conference on High Performance Computing and Communications, 2015
5. Unsupervised Multivariate Time Series Anomaly Detection via Transformer-Based Models and Time Series Encoding [D]. Duan, Tinglin. 2021
6. Robust and Accurate Anomaly Detection in ECG Artifacts Using Time Series Motif Discovery [O]. Haemwaan Sivaraks, Chotirat Ann Ratanamahatana. 2015
7. Time Series Motif Discovery and Anomaly Detection Based on Subseries Join [O]. Yi Lin, Michael D. McCool, Ali A. Ghorbani. 2010
