Disk aware discord discovery: finding unusual time series in terabyte sized datasets

Dragomir Yankov; Eamonn Keogh; Umaa Rebbapragada

Abstract

The problem of finding unusual time series has recently attracted much attention, and several promising methods are now in the literature. However, virtually all proposed methods assume that the data reside in main memory. For many real-world problems this is not the case. For example, in astronomy, multi-terabyte time series datasets are the norm. Most current algorithms, when faced with data that cannot fit in main memory, resort to multiple scans of the disk/tape and are thus intractable. In this work we show how one particular definition of unusual time series, the time series discord, can be discovered with a disk aware algorithm. The proposed algorithm is exact and requires only two linear scans of the disk with a tiny buffer of main memory. Furthermore, it is very simple to implement. We use the algorithm to provide further evidence of the effectiveness of the discord definition in areas as diverse as astronomy, web query mining, video surveillance, etc., and show the efficiency of our method on datasets which are many orders of magnitude larger than anything else attempted in the literature.
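
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how an exact two-scan search of this kind could look, not the authors' implementation. It assumes a candidate-selection scan followed by a refinement scan, Euclidean distance, and a user-chosen range threshold `r`; the names `find_discord`, `scan`, and `r` are hypothetical. Here `scan` is any callable that returns a fresh iterator over the series in a fixed order, standing in for one linear pass over the disk.

```python
import math
from typing import Callable, Dict, Iterable, Optional, Sequence, Tuple

Series = Sequence[float]


def euclidean(a: Series, b: Series) -> float:
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_discord(
    scan: Callable[[], Iterable[Series]], r: float
) -> Optional[Tuple[int, float]]:
    """Return (index, nearest-neighbour distance) of the series whose
    nearest neighbour is farthest away, provided that distance is >= r.
    Returns None if no series qualifies (r was chosen too large)."""
    # Scan 1 (candidate selection): keep a small in-memory set of series
    # that have not yet been seen within distance r of another series.
    candidates: Dict[int, Series] = {}
    for i, s in enumerate(scan()):
        is_candidate = True
        for j in list(candidates):
            if euclidean(s, candidates[j]) < r:
                del candidates[j]      # j has a neighbour closer than r
                is_candidate = False   # and so does the current series
        if is_candidate:
            candidates[i] = s

    # Scan 2 (refinement): compute each surviving candidate's true
    # nearest-neighbour distance, dropping false positives on the way.
    nn_dist: Dict[int, float] = {j: math.inf for j in candidates}
    for i, s in enumerate(scan()):
        for j in list(nn_dist):
            if i == j:
                continue               # skip the candidate itself
            d = euclidean(s, candidates[j])
            if d < r:
                del nn_dist[j]         # not a discord at this threshold
            elif d < nn_dist[j]:
                nn_dist[j] = d

    if not nn_dist:
        return None
    best = max(nn_dist, key=lambda j: nn_dist[j])
    return best, nn_dist[best]


# Usage: an in-memory list stands in for the on-disk dataset.
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
result = find_discord(lambda: iter(data), r=2.0)
```

In this sketch, any series whose nearest neighbour is at least `r` away necessarily survives the first scan, so the second scan only has to weed out false positives. A return value of `None` signals that `r` should be lowered and the search repeated.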

Bibliographic record

Source: Knowledge and Information Systems, 2008, Issue 2, pp. 241-262 (22 pages)
Authors: Dragomir Yankov; Eamonn Keogh; Umaa Rebbapragada
Original format: PDF
Language: English
Keywords: Time series; Discords; Distance outliers; Disk aware algorithms

Similar literature

1. Disk aware discord discovery: finding unusual time series in terabyte sized datasets [J]. Dragomir Yankov, Eamonn Keogh, Umaa Rebbapragada. Knowledge and Information Systems. 2008, Issue 2.
2. iSAX: disk-aware mining and indexing of massive time series datasets [J]. Jin Shieh, Eamonn Keogh. Data Mining and Knowledge Discovery. 2009, Issue 1.
3. iSAX: disk-aware mining and indexing of massive time series datasets [J]. Shieh J, Keogh E. Data Mining and Knowledge Discovery. 2009, Issue 1.
4. Disk Aware Discord Discovery: Finding Unusual Time Series in Terabyte Sized Datasets [C]. Dragomir Yankov, Eamonn Keogh, Umaa Rebbapragada. International Conference on Data Mining. 2007.
5. Time series retrieval: Indexing and mining large datasets [D]. Shieh, Jin-Wien. 2010.
6. Four-dimensional noise reduction using the time series of medical computed tomography datasets with short interval times: a static-phantom study [O]. Tatsuya Nishii, Atsushi K. Kono, Wakiko Tani.
7. Disk aware discord discovery: Finding unusual time series in terabyte sized datasets [O]. Dragomir Yankov, Eamonn Keogh, Umaa Rebbapragada. 2007.
