Journal: Data Mining and Knowledge Discovery

An ultra-fast time series distance measure to allow data mining in more complex real-world deployments

Abstract

At their core, many time series data mining algorithms reduce to reasoning about the shapes of time series subsequences. This requires an effective distance measure, and for the last two decades most algorithms have used Euclidean distance or DTW as their core subroutine. We argue that these distance measures are not as robust as the community seems to believe. The undue faith in these measures perhaps derives from an overreliance on benchmark datasets and from self-selection bias. The community is simply reluctant to address more difficult domains, for which current distance measures are ill-suited. In this work, we introduce a novel distance measure, MPdist. We show that our proposed distance measure is much more robust than current distance measures. For example, it can handle data with missing values or spurious regions. Furthermore, it allows us to successfully mine datasets that would defeat any Euclidean or DTW distance-based algorithm. Additionally, we show that our distance measure can be computed efficiently enough to allow analytics on very fast-arriving streams.
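The abstract does not define MPdist, so the following is only an illustrative sketch of how a subsequence-based measure can tolerate a spurious region. It assumes the formulation commonly given in the MPdist literature: z-normalize every subsequence, find each subsequence's nearest-neighbor distance in the other series (in both directions), and report the k-th smallest of those distances, with k set to 5% of the combined length. The function names, the `fraction` parameter, and the brute-force search are choices made for this example, not the authors' efficient algorithm.

```python
import math

def znorm(x):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(v - mean) / std for v in x]

def subsequences(ts, m):
    """All z-normalized sliding windows of length m."""
    return [znorm(ts[i:i + m]) for i in range(len(ts) - m + 1)]

def nn_distances(subs_a, subs_b):
    """For each window in subs_a, the distance to its nearest window in subs_b."""
    return [min(math.dist(a, b) for b in subs_b) for a in subs_a]

def mpdist_sketch(ts_a, ts_b, m, fraction=0.05):
    """Brute-force MPdist-style distance (assumed definition, O(n^2 * m))."""
    subs_a = subsequences(ts_a, m)
    subs_b = subsequences(ts_b, m)
    # Nearest-neighbor distances in both directions, sorted ascending.
    dists = sorted(nn_distances(subs_a, subs_b) + nn_distances(subs_b, subs_a))
    # Assumed choice of k: 5% of the combined series length, clamped to range.
    k = math.ceil(fraction * (len(ts_a) + len(ts_b))) - 1
    k = max(0, min(len(dists) - 1, k))
    return dists[k]

# A clean sine wave and a copy with one spurious spike.
clean = [math.sin(2 * math.pi * i / 20) for i in range(100)]
spiked = list(clean)
spiked[50] = 10.0

euclidean = math.dist(clean, spiked)           # dominated by the spike
robust = mpdist_sketch(clean, spiked, m=10)    # most windows still match
```

In this toy case the spike corrupts only the few windows that overlap it, so the k-th smallest nearest-neighbor distance stays near zero while the point-by-point Euclidean distance is large. That is the kind of robustness to spurious regions the abstract describes, under the assumptions stated above.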
