Journal: Data Mining and Knowledge Discovery

Matrix profile goes MAD: variable-length motif and discord discovery in data series



Abstract

In the last 15 years, data series motif and discord discovery have emerged as two useful and widely used primitives for data series mining, with applications in many domains, including robotics, entomology, seismology, medicine, and climatology. Nevertheless, state-of-the-art motif and discord discovery tools still require the user to provide the relative length. Yet in several cases the choice of length is critical and unforgiving. Unfortunately, the obvious brute-force solution, which tests all lengths within a given range, is computationally untenable. In this work, we introduce a new framework that provides an exact and scalable motif and discord discovery algorithm, which efficiently finds all motifs and discords in a given range of lengths. We evaluate our approach on five diverse real datasets and demonstrate that it is up to 20 times faster than the state of the art. Our results also show that removing the unrealistic assumption that the user knows the correct length can often produce more intuitive and actionable results that could otherwise have been missed.
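To make the cost of the brute-force baseline concrete, the sketch below recomputes a naive matrix profile from scratch for every length in a range and reports the best motif pair and top discord per length. It is not the framework proposed in the paper; it only illustrates the approach the abstract calls untenable. The function names, the z-normalized Euclidean distance, and the exclusion zone of half the subsequence length are all assumptions made for illustration.

```python
import math


def znorm(window):
    """Z-normalize a subsequence; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Naive O(n^2 * m) matrix profile for subsequence length m.

    profile[i] is the z-normalized Euclidean distance from subsequence i
    to its nearest non-trivial match; index[i] is where that match starts.
    """
    count = len(series) - m + 1
    excl = max(1, m // 2)  # assumed exclusion zone to skip trivial matches
    subs = [znorm(series[i:i + m]) for i in range(count)]
    profile = [math.inf] * count
    index = [-1] * count
    for i in range(count):
        for j in range(count):
            if abs(i - j) < excl:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


def brute_force_motifs_discords(series, min_len, max_len):
    """Recompute the whole profile for every length in [min_len, max_len]."""
    results = {}
    for m in range(min_len, max_len + 1):
        profile, index = matrix_profile(series, m)
        finite = [i for i, d in enumerate(profile) if math.isfinite(d)]
        if not finite:
            continue
        motif = min(finite, key=lambda i: profile[i])    # closest pair
        discord = max(finite, key=lambda i: profile[i])  # most isolated
        results[m] = {
            "motif": (motif, index[motif], profile[motif]),
            "discord": (discord, profile[discord]),
        }
    return results
```

Each call to `matrix_profile` costs on the order of n² · m operations in this naive form, and nothing is shared between consecutive lengths, so the total grows with the width of the length range. That repeated work is what makes the baseline expensive on long series.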
