Journal: Journal of intelligent & fuzzy systems: Applications in Engineering and Technology

Combining raw and normalized data in multivariate time series classification with dynamic time warping

Abstract

Data normalization is one of the most common processing methods applied to raw data before its subsequent use in data mining algorithms, classification, or clustering methods. Many procedures, particularly those that use any statistical analysis, require that data be normalized in one way or another. In the case of time series, a standard method of processing raw data is z-normalization of each time series instance in the data set. For multivariate (multidimensional) time series, we z-normalize each dimension (variable) individually. Although normalization brings many advantages, it is easy to find examples of data sets where normalization destroys information contained in the raw data. In this paper we demonstrate that, for multivariate time series (MTS), both the raw and the normalized components carry some information about the data, and that the best way of mining it is to combine them. We focus here on multidimensional time series and their classification using the nearest neighbor method with the dynamic time warping (DTW) distance measure. We construct a parametric distance measure that is a combination of DTW on raw and z-normalized time series data. It turns out that the combined distance measure carries more information about the data than either of the two distance components separately. By determining an individual parameter for each data set, it is possible to obtain a lower classification error than that of either component distance measure. We perform experiments on real data sets from many fields of science and technology. The advantage of the combined approach is confirmed by graphical and statistical comparisons.
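
The idea in the abstract can be illustrated with a minimal sketch. Several details here are assumptions, since the abstract does not state them: the combination is taken to be a convex mix (1 - alpha) * DTW_raw + alpha * DTW_znorm, the DTW local cost is the squared Euclidean distance across all dimensions, and all function names are hypothetical. The paper may define the parameter or rescale the two components differently.

```python
import math

def z_normalize(series):
    """Z-normalize each dimension of one multivariate series individually.

    `series` is a list of time steps, each a list with one value per dimension.
    """
    n, dims = len(series), len(series[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        column = [row[d] for row in series]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
        for t in range(n):
            # A constant dimension has zero variance; map it to zeros.
            out[t][d] = (column[t] - mean) / std if std > 0 else 0.0
    return out

def dtw(a, b):
    """Multivariate DTW distance (assumed local cost: squared Euclidean)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])

def combined_distance(a, b, alpha):
    """Assumed convex combination: alpha=0 is raw DTW, alpha=1 is z-normalized DTW."""
    raw = dtw(a, b)
    normalized = dtw(z_normalize(a), z_normalize(b))
    return (1.0 - alpha) * raw + alpha * normalized

def classify_1nn(query, train, labels, alpha):
    """Return the label of the training series nearest to `query`."""
    best = min(range(len(train)), key=lambda i: combined_distance(query, train[i], alpha))
    return labels[best]

# Toy data: both classes share the same shape and differ only in level,
# which is exactly the information that z-normalization removes.
train = [
    [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]],        # class "low"
    [[10.0, 10.0], [11.0, 11.0], [10.0, 10.0]],  # class "high"
]
labels = ["low", "high"]
query = [[9.0, 9.0], [10.0, 10.0], [9.0, 9.0]]

print(classify_1nn(query, train, labels, 0.0))  # → high
```

In this toy case the z-normalized series are all numerically identical, so at alpha = 1 the distance to both training series is essentially zero and the classes cannot be separated, while any alpha below 1 keeps the level information from the raw component. In practice alpha would be chosen per data set, for example by cross-validation on the training set, which is again an assumption about the procedure rather than something the abstract specifies.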
