Matrix profile goes MAD: variable-length motif and discord discovery in data series

Linardi Michele; Zhu Yan; Palpanas Themis; Keogh Eamonn

Abstract

In the last 15 years, data series motif and discord discovery have emerged as two useful and well-used primitives for data series mining, with applications to many domains, including robotics, entomology, seismology, medicine, and climatology. Nevertheless, the state-of-the-art motif and discord discovery tools still require the user to provide the relative length. Yet, in several cases, the choice of length is critical and unforgiving. Unfortunately, the obvious brute-force solution, which tests all lengths within a given range, is computationally untenable. In this work, we introduce a new framework, which provides an exact and scalable motif and discord discovery algorithm that efficiently finds all motifs and discords in a given range of lengths. We evaluate our approach with five diverse real datasets, and demonstrate that it is up to 20 times faster than the state of the art. Our results also show that removing the unrealistic assumption that the user knows the correct length can often produce more intuitive and actionable results, which could have otherwise been missed.
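To make concrete why the brute-force approach mentioned in the abstract is costly, the following is a minimal sketch of that baseline only, not the framework the authors propose. The function names, the non-overlap rule for excluding trivial matches, and the division by sqrt(m) to compare distances across lengths are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def motif_pair(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences, found by exhaustive comparison."""
    n = len(series) - m + 1
    subs = np.array([znorm(series[i:i + m]) for i in range(n)])
    best = (np.inf, -1, -1)
    for i in range(n):
        # Assumption: require j >= i + m so the two subsequences never overlap
        for j in range(i + m, n):
            d = np.linalg.norm(subs[i] - subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

def brute_force_variable_length(series, min_len, max_len):
    """Repeat the fixed-length search once for every length in the range."""
    results = {}
    for m in range(min_len, max_len + 1):
        d, i, j = motif_pair(series, m)
        # Assumption: scale by 1/sqrt(m) so different lengths are comparable
        results[m] = (d / np.sqrt(m), i, j)
    return results

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.normal(size=100)
    pattern = np.sin(np.linspace(0, 2 * np.pi, 20))
    series[10:30] = pattern   # plant the same shape twice
    series[60:80] = pattern
    for m, (d, i, j) in brute_force_variable_length(series, 16, 20).items():
        print(m, round(d, 4), i, j)
```

Each call to `motif_pair` already compares a quadratic number of subsequence pairs, and the outer loop repeats that work for every candidate length, which is the cost the abstract describes as untenable on long series.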

Bibliographic details

Source
Data mining and knowledge discovery | 2020, Issue 4 | 50 pages
Authors
Linardi Michele; Zhu Yan; Palpanas Themis; Keogh Eamonn
Author affiliations

Univ Paris Paris France;

Univ Calif Riverside Riverside CA 92521 USA;

Univ Paris French Univ Inst IUF Paris France;

Univ Calif Riverside Riverside CA 92521 USA;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Data series; Time series; Motif discovery; Variable length; Data mining

Similar literature

1. Matrix profile goes MAD: variable-length motif and discord discovery in data series [J]. Linardi Michele, Zhu Yan, Palpanas Themis. Data mining and knowledge discovery. 2020, Issue 4
2. Time series joins, motifs, discords and shapelets: a unifying view that exploits the matrix profile [J]. Yeh Chin-Chia Michael, Zhu Yan, Ulanova Liudmila. Data mining and knowledge discovery. 2018, Issue 1
3. Disk aware discord discovery: finding unusual time series in terabyte sized datasets [J]. Dragomir Yankov, Eamonn Keogh, Umaa Rebbapragada. Knowledge and information systems. 2008, Issue 2
4. Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View That Includes Motifs, Discords and Shapelets [C]. Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova. IEEE International Conference on Data Mining. 2016
5. The Matrix Profile: Scalable Algorithms and New Primitives for Time Series Data Mining [D]. Zhu, Yan. 2018
6. Probabilistic variable-length segmentation of protein sequences for discriminative motif discovery (DiMotif) and sequence embedding (ProtVecX) [O]. Ehsaneddin Asgari, Alice C. McHardy, Mohammad R. K. Mofrad
7. Matrix profile goes MAD: variable-length motif and discord discovery in data series [O]. Michele Linardi, Yan Zhu, Themis Palpanas. 2020
