Journal: Expert Systems with Applications

Hierarchical clustering of time series data with parametric derivative dynamic time warping

Abstract

Dynamic Time Warping (DTW) is a popular and efficient distance measure used in classification and clustering algorithms applied to time series data. By computing the DTW distance not on raw data but on the time series of the (first, discrete) derivative of the data, we obtain the so-called Derivative Dynamic Time Warping (DDTW) distance measure. DDTW, used alone, is usually inefficient, but there exist datasets on which DDTW gives good results, sometimes much better than DTW. To improve the performance of the two distance measures, we can combine them into a new single (parametric) distance function. The literature contains examples of combining DTW and DDTW in algorithms for supervised classification of time series data. In this paper, we demonstrate that a combination of DTW and DDTW can also be applied in a method of time series clustering (unsupervised classification). In particular, we focus on hierarchical clustering (with average linkage) of univariate (one-dimensional) time series data. We construct a new parametric distance function combining DTW and DDTW, where a single real-valued parameter controls the contribution of each of the two measures to the total value of the combined distance. The parameter is tuned in the initial phase of the clustering algorithm. Using this technique in clustering methods requires a different approach (to address certain specific problems) than for supervised methods. In the clustering process we use three internal cluster validation measures (measures which do not use labels) and three external cluster validation measures (measures which do use clustering data labels). Internal measures are used to select an optimal value of the parameter of the algorithm, whereas external measures give information about the overall performance of the new method and enable comparison with other distance functions.
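
As a rough illustration of the distance measures described above, the following Python sketch computes DTW, DDTW, and a parametric combination of the two. The abstract does not give the exact formulas, so several choices here are assumptions made for illustration only: the absolute-difference local cost, the plain first difference used as the discrete derivative, and the convex combination (1 - alpha) * DTW + alpha * DDTW with a hypothetical parameter named `alpha`. The function names are likewise illustrative and are not taken from the paper.

```python
import math


def dtw(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference of two samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def first_difference(x):
    """Discrete first derivative, assumed here to be x[i] - x[i-1]."""
    return [x[i] - x[i - 1] for i in range(1, len(x))]


def ddtw(a, b):
    """DTW computed on the derivative series instead of the raw series."""
    return dtw(first_difference(a), first_difference(b))


def combined_distance(a, b, alpha):
    """Hypothetical parametric combination of DTW and DDTW.

    Assumption: a convex combination with alpha in [0, 1];
    alpha = 0 gives pure DTW and alpha = 1 gives pure DDTW.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * dtw(a, b) + alpha * ddtw(a, b)


# Toy example: y is x shifted by a constant offset.
x = [0, 1, 2, 3]
y = [10, 11, 12, 13]
for alpha in (0.0, 0.5, 1.0):
    print(alpha, combined_distance(x, y, alpha))
```

In this toy example `y` is `x` shifted by a constant, so the two derivative series coincide and the DDTW term is zero while the DTW term is not. The parameter then decides how much of the offset the combined distance retains.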
Computational experiments are performed on a large real-world database (UCR Time Series Classification Archive: 84 datasets) from a very broad range of fields, including medicine, finance, multimedia and engineering. The experimental results demonstrate the effectiveness of the proposed approach for hierarchical clustering of time series data. The method with the new parametric distance function outperforms DTW (and DDTW) on the database used. The results are confirmed by graphical and statistical comparison. © 2016 Elsevier Ltd. All rights reserved.
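
The clustering and parameter-selection procedure summarized in the abstract can be sketched in the same hedged spirit. The Python code below runs average-linkage agglomerative clustering on precomputed DTW and DDTW distance matrices, picks the parameter value with the best internal index, and scores the result with an external index. The abstract does not name its three internal and three external measures, so the silhouette width and the Rand index are used here only as stand-ins. The convex combination, the candidate grid `alphas`, the number of clusters `k`, and all function names are assumptions as well.

```python
import itertools


def average_linkage(dist, k):
    """Agglomerative clustering with average linkage on a precomputed
    distance matrix; merging stops when k clusters remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p, q in itertools.combinations(range(len(clusters)), 2):
            # Average linkage: mean of all pairwise distances between members.
            total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
            d = total / (len(clusters[p]) * len(clusters[q]))
            if best is None or d < best[0]:
                best = (d, p, q)
        _, p, q = best
        clusters[p].extend(clusters[q])  # q > p, so deleting q keeps p valid
        del clusters[q]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def silhouette(dist, labels):
    """Mean silhouette width (internal index: uses no ground-truth labels)."""
    n = len(labels)
    total = 0.0
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if j != i:
                sums[labels[j]] = sums.get(labels[j], 0.0) + dist[i][j]
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        others = [sums[c] / counts[c] for c in counts if c != own]
        if own not in counts or not others:
            continue  # singleton cluster or single cluster: contributes 0
        a, b = sums[own] / counts[own], min(others)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def rand_index(labels, truth):
    """Rand index (external index: compares against ground-truth labels)."""
    pairs = list(itertools.combinations(range(len(labels)), 2))
    agree = sum((labels[i] == labels[j]) == (truth[i] == truth[j])
                for i, j in pairs)
    return agree / len(pairs)


def tune_alpha(d_dtw, d_ddtw, k, alphas):
    """Return (score, alpha, labels) for the alpha with the best silhouette."""
    best = None
    for alpha in alphas:
        # Assumed convex combination of the two distance matrices.
        dist = [[(1 - alpha) * u + alpha * v for u, v in zip(ru, rv)]
                for ru, rv in zip(d_dtw, d_ddtw)]
        labels = average_linkage(dist, k)
        score = silhouette(dist, labels)
        if best is None or score > best[0]:
            best = (score, alpha, labels)
    return best


# Toy matrices: DTW separates {0, 1} from {2, 3}; DDTW carries no structure.
d_dtw = [[0, 1, 9, 9], [1, 0, 9, 9], [9, 9, 0, 1], [9, 9, 1, 0]]
d_ddtw = [[0, 5, 5, 5], [5, 0, 5, 5], [5, 5, 0, 5], [5, 5, 5, 0]]
score, alpha, labels = tune_alpha(d_dtw, d_ddtw, k=2, alphas=[0.0, 0.5, 1.0])
print(alpha, labels, rand_index(labels, [1, 1, 0, 0]))
```

Only the internal index takes part in choosing the parameter. The external index is computed afterwards from the known labels, which mirrors the division of roles described in the abstract.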
