Generating Synthetic Time Series to Augment Sparse Datasets

Abstract

In machine learning, data augmentation is the process of creating synthetic examples in order to augment a dataset used to learn a model. One motivation for data augmentation is to reduce the variance of a classifier, thereby reducing error. In this paper, we propose new data augmentation techniques specifically designed for time series classification, where the space in which the series are embedded is induced by Dynamic Time Warping (DTW). The main idea of our approach is to average a set of time series and use the average time series as a new synthetic example. The proposed methods rely on an extension of DTW Barycentric Averaging (DBA), the averaging technique specifically developed for DTW. We extend DBA so that it can calculate a weighted average of time series under DTW: instead of each time series contributing equally to the final average, some can contribute more than others. This extension allows us to generate an infinite number of new examples from any set of given time series. To this end, we propose three methods for choosing the weights associated with the time series of the dataset. We carry out experiments on the 85 datasets of the UCR archive and demonstrate, using a 1-NN DTW classifier, that our method is particularly useful when the number of available examples is limited (e.g., 2 to 6 examples per class). Furthermore, we show that augmenting full datasets is beneficial in most cases: we observed an increase in accuracy on 56 datasets, no effect on 7, and a slight decrease on only 22.
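
The weighted averaging idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`dtw_path`, `weighted_dba`, `synthetic_example`), the initialisation from the highest-weight series, the fixed iteration count, and the flat Dirichlet draw for the weights are all choices made here for illustration. The abstract does not specify how the three proposed methods choose the weights.

```python
import numpy as np


def dtw_path(a, b):
    """Return (DTW cost with squared point distances, alignment path) for 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end to recover which points were aligned.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]


def weighted_dba(series, weights, n_iter=10):
    """Weighted average of 1-D series under DTW (DBA-style iterative refinement).

    Weights must be non-negative with a positive sum.
    """
    series = [np.asarray(s, dtype=float) for s in series]
    weights = np.asarray(weights, dtype=float)
    # Assumption: start from the series with the largest weight.
    avg = series[int(np.argmax(weights))].copy()
    for _ in range(n_iter):
        num = np.zeros_like(avg)
        den = np.zeros_like(avg)
        for s, w in zip(series, weights):
            _, path = dtw_path(avg, s)
            # Each point of s contributes, scaled by w, to the average
            # coordinate it is aligned with.
            for i, j in path:
                num[i] += w * s[j]
                den[i] += w
        avg = num / den
    return avg


def synthetic_example(series, rng, n_iter=10):
    """Generate one synthetic series from a random weight vector.

    Assumption: weights come from a flat Dirichlet distribution; this is a
    placeholder, not one of the paper's three weighting methods.
    """
    weights = rng.dirichlet(np.ones(len(series)))
    return weighted_dba(series, weights, n_iter)
```

A call such as `synthetic_example([s1, s2, s3], np.random.default_rng(0))` returns a new series with the length of the series used for initialisation. Because every coordinate is a convex combination of aligned input values, each call with a fresh weight vector yields a different but plausible example, which is what makes an unbounded number of synthetic series possible.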

Bibliographic details

Source
IEEE International Conference on Data Mining, 2017, pp. 865-870 (6 pages)
Authors
Germain Forestier; François Petitjean; Hoang Anh Dau; Geoffrey I. Webb; Eamonn Keogh
Original format: PDF
Keywords
Time series analysis; Training; Data models; Heuristic algorithms; Manifolds; Conferences; Data mining
