An ultra-fast time series distance measure to allow data mining in more complex real-world deployments

Gharghabi Shaghayegh; Imani Shima; Bagnall Anthony; Darvishzadeh Amirali; Keogh Eamonn


Abstract

At their core, many time series data mining algorithms reduce to reasoning about the shapes of time series subsequences. This requires an effective distance measure, and for the last two decades most algorithms have used Euclidean distance or DTW as their core subroutine. We argue that these distance measures are not as robust as the community seems to believe. The undue faith in these measures perhaps derives from an overreliance on benchmark datasets and from self-selection bias. The community is simply reluctant to address more difficult domains, for which current distance measures are ill-suited. In this work, we introduce a novel distance measure, MPdist. We show that our proposed distance measure is much more robust than current distance measures. For example, it can handle data with missing values or spurious regions. Furthermore, it allows us to successfully mine datasets that would defeat any Euclidean or DTW distance-based algorithm. Additionally, we show that our distance measure can be computed so efficiently as to allow analytics on very fast arriving streams.
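The abstract does not define MPdist, so the following is only a rough illustration, not the authors' implementation. It assumes the definition usually given for this measure: for a subsequence length m, find the z-normalized Euclidean distance from every subsequence of one series to its nearest neighbor in the other (in both directions), pool those distances, and report the k-th smallest, with k set to 5% of the combined series length. The function names, the brute-force search, and the 5% default are all assumptions made for this sketch.

```python
import math
import random


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    if sd < 1e-12:
        return [0.0] * n
    return [(v - mean) / sd for v in window]


def windows(series, m):
    """All z-normalized length-m subsequences of a series."""
    return [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]


def nn_profile(a, b, m):
    """Distance from each subsequence of a to its nearest subsequence of b."""
    wa, wb = windows(a, m), windows(b, m)
    return [min(math.dist(x, y) for y in wb) for x in wa]


def mpdist_sketch(a, b, m, fraction=0.05):
    """Hypothetical brute-force MPdist: k-th smallest pooled neighbor distance."""
    # Assumption: pool both join directions, then take the k-th smallest value.
    profile = sorted(nn_profile(a, b, m) + nn_profile(b, a, m))
    k = math.ceil(fraction * (len(a) + len(b)))
    k = min(max(k, 1), len(profile))
    return profile[k - 1]


random.seed(0)
clean = [math.sin(2 * math.pi * i / 25) for i in range(100)]
# Same shape, but preceded by a spurious region of random values.
noisy = [random.uniform(-1, 1) for _ in range(30)] + clean
unrelated = [random.uniform(-1, 1) for _ in range(100)]

d_same = mpdist_sketch(clean, clean, 16)
d_noisy = mpdist_sketch(clean, noisy, 16)
d_unrelated = mpdist_sketch(clean, unrelated, 16)
print(d_same, d_noisy, d_unrelated)
```

Under these assumptions, the series with the spurious prefix should still score as essentially identical to the clean one, because most of its subsequences have an exact match, while the unrelated series scores clearly higher. Note that the two series being compared need not have the same length, which plain Euclidean distance would require. This brute-force version is quadratic in the series lengths and is not the efficient computation the abstract refers to.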


Bibliographic details

Source
Data Mining and Knowledge Discovery | 2020, Issue 4 | 32 pages
Authors
Gharghabi Shaghayegh; Imani Shima; Bagnall Anthony; Darvishzadeh Amirali; Keogh Eamonn
Author affiliations

Univ Calif Riverside Riverside CA 92521 USA;

Univ Calif Riverside Riverside CA 92521 USA;

Univ East Anglia Norwich Norfolk England;

Univ Calif Riverside Riverside CA 92521 USA;

Univ Calif Riverside Riverside CA 92521 USA;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Time series; Distance measure; Matrix profile;


Similar literature

1. An ultra-fast time series distance measure to allow data mining in more complex real-world deployments [J]. Gharghabi Shaghayegh, Imani Shima, Bagnall Anthony. Data Mining and Knowledge Discovery. 2020, Issue 4
2. A New Symbolization and Distance Measure Based Anomaly Mining Approach for Hydrological Time Series [J]. Zhang Pengcheng, Xiao Yan, Zhu Yuelong. International Journal of Web Services Research. 2016, Issue 3
3. MDA: A Reconfigurable Memristor-Based Distance Accelerator for Time Series Mining on Data Centers [J]. Xu Xiaowei, Lin Feng, Xu Wenyao. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 2019, Issue 5
4. A Local Segmented Dynamic Time Warping Distance Measure Algorithm for Time Series Data Mining [C]. Xiao-Li Dong, Cheng-Kui Gu, Zheng-Ou Wang. Proceedings of the 2006 International Conference on Machine Learning and Cybernetics. 2006
5. Mining time series data: Moving from toy problems to realistic deployments [D]. Hu, Bing. 2013
6. Analysis of Solar Irradiation Time Series Complexity and Predictability by Combining Kolmogorov Measures and Hamming Distance for La Reunion (France) [O]. Dragutin T. Mihailović, Miloud Bessafi, Sara Marković. 2018
7. A new symbolization and distance measure based anomaly mining approach for hydrological time series [O]. Zhang P, Xiao Y, Zhu Y. 2016
