Optimizing dynamic time warping's window width for time series data mining applications

Hoang Anh Dau; Silva Diego Furtado; Petitjean Francois; Forestier Germain; Bagnall Anthony; Mueen Abdullah; Keogh Eamonn

Abstract

Dynamic Time Warping (DTW) is a highly competitive distance measure for most time series data mining problems. Obtaining the best performance from DTW requires setting its only parameter, the maximum amount of warping (w). In the supervised case with ample data, w is typically set by cross-validation in the training stage. However, this method is likely to yield suboptimal results for small training sets. For the unsupervised case, learning via cross-validation is not possible because we do not have access to labeled data. Many practitioners have thus resorted to assuming that "the larger the better", and they use the largest value of w permitted by the computational resources. However, as we will show, in most circumstances, this is a naïve approach that produces inferior clusterings. Moreover, the best warping window width is generally non-transferable between the two tasks, i.e., for a single dataset, practitioners cannot simply apply the best w learned for classification to clustering or vice versa. In addition, we will demonstrate that the appropriate amount of warping depends not only on the data structure, but also on the dataset size. Thus, even if a practitioner knows the best setting for a given dataset, they will likely be at a loss if they apply that setting to a larger version of that data. All these issues seem largely unknown or at least unappreciated in the community. In this work, we demonstrate the importance of setting DTW's warping window width correctly, and we also propose novel methods to learn this parameter in both supervised and unsupervised settings. The algorithms we propose to learn w can produce significant improvements in classification accuracy and clustering quality. We demonstrate the correctness of our novel observations and the utility of our ideas by testing them with more than one hundred publicly available datasets. Our forceful results allow us to make a perhaps unexpected claim; an underappreciated "low hanging frui
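
The conventional supervised baseline mentioned in the abstract (setting w by cross-validation on the training set) can be sketched as below. This is only an illustrative sketch, not the novel method the authors propose: it assumes a Sakoe-Chiba band with w expressed as an absolute number of cells, a squared pointwise cost, 1-nearest-neighbor classification, and leave-one-out cross-validation. The function names `dtw_distance`, `loocv_accuracy`, and `select_window` are hypothetical and do not come from the paper.

```python
import math


def dtw_distance(a, b, w):
    """DTW distance between sequences a and b, constrained to a
    Sakoe-Chiba band of half-width w cells (assumed constraint type)."""
    n, m = len(a), len(b)
    w = max(w, abs(n - m))  # the band must be wide enough to reach the end cell
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefixes align at zero cost
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        # Only cells with |i - j| <= w are allowed.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def loocv_accuracy(series, labels, w):
    """Leave-one-out accuracy of a 1-NN classifier using DTW with window w."""
    correct = 0
    for i, query in enumerate(series):
        best_dist, best_label = math.inf, None
        for j, candidate in enumerate(series):
            if i == j:
                continue  # hold the query out of its own neighbor search
            d = dtw_distance(query, candidate, w)
            if d < best_dist:
                best_dist, best_label = d, labels[j]
        correct += best_label == labels[i]
    return correct / len(series)


def select_window(series, labels, candidate_windows):
    """Return (w, accuracy) with the highest leave-one-out accuracy.
    Ties go to the smallest w, which is also the cheapest to compute."""
    best_w, best_acc = None, -1.0
    for w in sorted(candidate_windows):
        acc = loocv_accuracy(series, labels, w)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Toy data (made up for illustration): class "A" is a shifted spike, class "B" is flat.
toy_series = [[0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
toy_labels = ["A", "A", "B", "B"]
print(select_window(toy_series, toy_labels, [0, 1, 2]))
```

With w = 0 the measure reduces to Euclidean distance for equal-length series, so the two shifted spikes are each closer to the flat series than to one another. A small nonzero window lets them align. With only four training series the accuracy estimate is very coarse, which is the small-training-set weakness the abstract points out.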

Bibliographic record

Source
Data Mining and Knowledge Discovery | 2018, Issue 4 | 47 pages
Authors
Hoang Anh Dau; Silva Diego Furtado; Petitjean Francois; Forestier Germain; Bagnall Anthony; Mueen Abdullah; Keogh Eamonn
Author affiliations

Univ Calif Riverside Riverside CA 92521 USA;

Univ Fed Sao Carlos Sao Carlos SP Brazil;

Univ Haute Alsace Mulhouse France;

Univ East Anglia Norwich Norfolk England;

Univ New Mexico Albuquerque NM 87131 USA;

Univ Calif Riverside Riverside CA 92521 USA;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology; computer technology
Keywords
Time series; Clustering; Classification; Dynamic time warping; Semi-supervised learning
