Functional singular value decomposition and multi-resolution anomaly detection.

Abstract

This dissertation has two major parts. The first part discusses the connections and differences between the statistical tool of Principal Component Analysis (PCA) and the related numerical method of Singular Value Decomposition (SVD), along with related visualization methods. The second part proposes a Multi-Resolution Anomaly Detection (MRAD) method for time series with long range dependence (LRD).

PCA is a popular method in multivariate analysis and in Functional Data Analysis (FDA). Compared to PCA, SVD is more general: it not only provides a direct way to calculate the principal components (PCs), but also simultaneously yields the PCAs for both the row and the column spaces. SVD has been used directly to explore and analyze data sets, and has been shown to be an insightful analysis tool in many fields. However, the connections and differences between PCA and SVD have seldom been explored from a statistical viewpoint. Here we explore these connections and differences, and extend the usual SVD method to variations that include different centerings based on various types of means. A generalized scree plot is developed as a visual aid for selecting among the centerings. Several matrix views of the SVD components are introduced to explore different features in data, including SVD surface plots, image plots, rotation movies, and curve movies. These methods visualize both column and row information of a two-way matrix simultaneously, relate the matrix to relevant curves, and show local variations and interactions between columns and rows. Several toy examples are designed to compare the different types of centerings, and three real applications are used to illustrate the matrix views.

In the field of Internet traffic anomaly detection, different types of network anomalies exist at different time scales. This motivates anomaly detection methods that effectively exploit multiscale properties. Because time series of Internet measurements exhibit long range dependence (LRD) and self-similarity (SS), classical outlier detection methods based on short-range dependent time series may not be suitable for identifying network anomalies. Starting from a time series collected at a single scale (the finest scale), we aggregate it to form time series at various scales, and propose an MRAD procedure to find anomalies that appear at different time scales. We show that, under some reasonable assumptions, this MRAD method is more conservative than a typical outlier detection method based on a given scale, and has larger power on average than any single-scale outlier detection method. The asymptotic distribution of the test statistic is developed as well. An MRAD map is developed to show candidate anomalies and the corresponding significance probabilities (p values). The method can easily be extended to run in real time. Simulations and real examples are reported to illustrate the usefulness of the MRAD method.

Keywords: Principal Component Analysis, Functional Data Analysis, Exploratory Data Analysis, Network Intrusion Detection, Outlier Detection, Level Shift, Multiscale Analysis, Long Range Dependence, Multiple Comparison, p values, Time Series, False Discovery Rate.
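
The link between PCA and SVD described in the first part of the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the dissertation: the function name `centered_svd` and the centering labels (`"none"`, `"overall"`, `"column"`, `"row"`, `"double"`) are assumptions chosen for illustration, and the abstract does not say which centerings the author actually studies. The only fact the sketch relies on is the standard one that the SVD of a column-centered data matrix reproduces the PCA of its sample covariance.

```python
import numpy as np

def centered_svd(X, centering="column"):
    """SVD of X after subtracting a chosen type of mean.

    The centering names are hypothetical labels for this sketch,
    not terminology confirmed by the dissertation.
    """
    X = np.asarray(X, dtype=float)
    col_mean = X.mean(axis=0, keepdims=True)   # one mean per column
    row_mean = X.mean(axis=1, keepdims=True)   # one mean per row
    if centering == "none":
        Xc = X
    elif centering == "overall":
        Xc = X - X.mean()
    elif centering == "column":
        Xc = X - col_mean
    elif centering == "row":
        Xc = X - row_mean
    elif centering == "double":
        Xc = X - col_mean - row_mean + X.mean()
    else:
        raise ValueError(f"unknown centering: {centering!r}")
    # Reduced SVD: Xc = U @ diag(s) @ Vt
    return np.linalg.svd(Xc, full_matrices=False)

# Toy data: 50 observations (rows) of 4 correlated variables (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))

U, s, Vt = centered_svd(X, "column")

# PCA variances from the sample covariance matrix, largest first.
pca_variances = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]

# Squared singular values of the column-centered matrix, scaled by n - 1,
# are the same quantities.
svd_variances = s**2 / (X.shape[0] - 1)
```

With column centering, the rows of `Vt` play the role of PC directions and `U * s` gives the PC scores; the other centerings change what the leading components describe, which is the kind of comparison the abstract refers to.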
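
The aggregation step in the second part of the abstract (forming coarser series from the finest-scale series) can be sketched as follows. This is only an illustration under stated assumptions: the per-scale rule used here, a robust z-score against a fixed threshold, is a simple stand-in and not the dissertation's test statistic, and it ignores long range dependence and multiple-comparison adjustment. The names `aggregate`, `multiscale_flags`, `blocks`, and `threshold` are hypothetical.

```python
import numpy as np

def aggregate(x, block):
    """Sum non-overlapping blocks of length `block`, dropping an incomplete tail."""
    x = np.asarray(x, dtype=float)
    n = (len(x) // block) * block
    return x[:n].reshape(-1, block).sum(axis=1)

def multiscale_flags(x, blocks=(1, 2, 4, 8, 16), threshold=4.0):
    """Map each block size to the indices of aggregated points flagged as outliers.

    Stand-in rule (an assumption, not the MRAD statistic): flag points whose
    robust z-score, based on the median and the MAD, exceeds `threshold`.
    """
    flags = {}
    for b in blocks:
        y = aggregate(x, b)
        med = np.median(y)
        scale = 1.4826 * np.median(np.abs(y - med))  # MAD scaled for Gaussian data
        if scale == 0:
            flags[b] = np.array([], dtype=int)       # degenerate series: flag nothing
            continue
        z = np.abs(y - med) / scale
        flags[b] = np.flatnonzero(z > threshold)
    return flags

# Toy series: white noise with a modest level shift lasting 16 time points.
rng = np.random.default_rng(0)
x = rng.normal(size=512)
x[192:208] += 3.0

flags = multiscale_flags(x)
```

In this toy series the shift covers exactly one block at block size 16 (aggregated index 12), so its effect accumulates in the coarse series while staying moderate at the finest scale, which is the motivation the abstract gives for looking at several scales.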

Record details

  • Author: Zhang, Lingsong
  • Author affiliation: The University of North Carolina at Chapel Hill, Statistics
  • Degree-granting institution: The University of North Carolina at Chapel Hill, Statistics
  • Subject: Statistics
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 158 p.
  • Total pages: 158
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Statistics
