Journal: Data Mining and Knowledge Discovery

The Swiss army knife of time series data mining: ten useful things you can do with the matrix profile and ten lines of code


Abstract

The recently introduced data structure, the Matrix Profile, annotates a time series by recording the location of and distance to the nearest neighbor of every subsequence. This information trivially provides answers to queries for both time series motifs and time series discords, perhaps two of the most frequently used primitives in time series data mining. One attractive feature of the Matrix Profile is that it completely divorces the high-level details of the analytics performed from the computational "heavy lifting." The Matrix Profile can be computed using the appropriate computational paradigm for the task at hand: CPU, GPU, FPGA, distributed computing, anytime computation, incremental computation, and so forth. However, all the details of such computation can be hidden from the analyst, who only needs to think about her analytical need. In this work, we expand on this philosophy and ask the following question: If we assume that we get the Matrix Profile for free, what interesting analytics can we do, writing at most ten lines of code? As we will show, the answer is surprisingly large and diverse. Our aim here is not to establish or compete with state-of-the-art results, but merely to show that we can both reproduce the results of many existing algorithms and find novel regularities in time series data collections with very little effort.
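
The definition in the abstract (for every subsequence, the distance to and location of its nearest neighbor) can be illustrated with a minimal brute-force sketch. This is not the authors' algorithm and makes several assumptions that the abstract does not state: subsequences are compared with z-normalized Euclidean distance, neighbors within half a window length are skipped as trivial matches, and the names `znorm`, `naive_matrix_profile`, and the synthetic series are hypothetical. Under those assumptions, the lowest profile value marks a motif pair and the highest marks a discord.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array; constant windows map to all zeros."""
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def naive_matrix_profile(ts, m):
    """Brute-force O(n^2 * m) sketch, for illustration only.

    Returns (profile, index): for each length-m subsequence, the distance to
    and the start position of its nearest non-trivial neighbor.
    """
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    excl = max(1, m // 2)  # assumption: exclusion zone to skip trivial matches
    profile = np.full(n, np.inf)
    index = np.full(n, -1)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        d[max(0, i - excl):min(n, i + excl + 1)] = np.inf  # mask self and near-self
        j = int(np.argmin(d))
        profile[i], index[i] = d[j], j
    return profile, index

# Hypothetical synthetic data: an exactly repeating pattern with one distorted stretch.
m = 25
ts = np.tile(np.sin(2 * np.pi * np.arange(m) / m), 12)  # 300 points
ts[150:160] += 2.0                                       # planted anomaly

profile, index = naive_matrix_profile(ts, m)
motif = int(np.argmin(profile))    # start of a subsequence with the closest match
discord = int(np.argmax(profile))  # start of the subsequence farthest from any match

print("motif pair starts at:", motif, "and", int(index[motif]))
print("discord starts at:", discord)
```

Once `profile` and `index` exist, the motif and discord queries are each a single `argmin` or `argmax`, which is the separation between analytics and computation that the abstract describes. An optimized implementation could replace `naive_matrix_profile` without changing those two lines.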
