High performance algorithms for multiple streaming time series.

Abstract

Data arriving in time order (a data stream) arises in fields ranging from physics to finance to medicine to music, to name a few. Often the data come from sensors (in physics and medicine, for example) whose data rates continue to improve dramatically as sensor technology improves. Furthermore, the number of sensors is increasing, so analyzing data across sensors becomes ever more critical in order to distill knowledge from the data. Fast response is desirable in many applications (e.g., to aim a telescope at an activity of interest or to perform a stock trade). In applications such as finance, recent information, e.g., correlation, is of far more interest than older information, so analysis over sliding windows is a desired operation.

These three factors (huge data size, fast response, and windowed computation) motivated this work. Our intent is to build a foundational library of primitives to perform online or near-online statistical analysis, e.g., windowed correlation, incremental matching pursuit, and burst detection, on thousands or even millions of time series. Besides the algorithms, we also propose the concept of "uncooperative" time series, whose power spectra are spread over all frequencies without any regularity.

Previous work showed how to do windowed correlation with Fast Fourier Transforms and Wavelet Transforms, but such techniques do not work for uncooperative time series. This thesis shows how to use sketches (random projections) in a way that combines several simple techniques (sketches, convolution, structured random vectors, grid structures, combinatorial design, and bootstrapping) to achieve high-performance windowed correlation over a variety of data sets. Experiments confirm the asymptotic analysis. To conduct matching pursuit (MP) over time series windows, an incremental scheme is designed to reduce the computational effort. Our empirical study demonstrates a substantial improvement in speed.

In previous work, Zhu and Shasha introduced an efficient algorithm to monitor bursts within windows of multiple sizes. We implemented it in a physical system by overcoming several practical challenges. Experimental results support the authors' linear running time analysis.
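The sketch (random projection) idea named in the abstract can be illustrated with a minimal Python example. This is not the thesis's implementation: it leaves out the convolution-based sliding update, the structured random vectors, the grid structure, and the bootstrapping. It assumes NumPy, random vectors with ±1 entries, and hypothetical function names (`normalize`, `sketch`, `estimate_correlation`). It relies on a standard identity: for windows normalized to zero mean and unit length, the Pearson correlation equals 1 minus half the squared Euclidean distance, and that distance is approximately preserved by random projections.

```python
import numpy as np


def normalize(window):
    """Shift a window to zero mean and scale it to unit Euclidean length."""
    w = np.asarray(window, dtype=float)
    w = w - w.mean()
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def sketch(window, random_vectors):
    """Project the normalized window onto each random vector (one per row)."""
    return random_vectors @ normalize(window)


def estimate_correlation(sketch_x, sketch_y):
    """Estimate Pearson correlation from two sketches.

    The mean squared difference of the sketch entries estimates the squared
    distance d^2 between the normalized windows, and corr = 1 - d^2 / 2.
    """
    diff = np.asarray(sketch_x) - np.asarray(sketch_y)
    d2 = np.mean(diff ** 2)
    return 1.0 - d2 / 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 256, 512  # window length and sketch size (illustrative values)
    # Assumption: i.i.d. +/-1 entries; the thesis uses structured vectors.
    R = rng.choice([-1.0, 1.0], size=(k, n))

    x = np.cumsum(rng.standard_normal(n))   # a random-walk series
    y = x + 0.1 * rng.standard_normal(n)    # a noisy copy of x

    exact = float(normalize(x) @ normalize(y))
    approx = estimate_correlation(sketch(x, R), sketch(y, R))
    print(f"exact={exact:.4f} approx={approx:.4f}")
```

The sketch size here is chosen for a clear demonstration rather than speed. A sketch is only worthwhile when it is much shorter than the window, which gives noisier estimates, so a practical system would use sketches as a filter and verify candidate pairs on the raw data.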

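Bibliographic record

  • Author: Zhao, Xiaojian
  • Author affiliation: New York University
  • Degree-granting institution: New York University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 99 p.
  • Total pages: 99
  • Original format: PDF
  • Language: English (eng)
  • Chinese Library Classification: Automation technology, computer technology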