Vietnam Journal of Computer Science

Similarity search for numerous patterns over multiple time series streams under dynamic time warping which supports data normalization

Abstract

A major challenge in today's data mining is similarity search over streaming time series under Dynamic Time Warping (DTW). In similarity search, data normalization is required to obtain accurate results. However, on-the-fly data normalization and the DTW calculation cost a great deal of computational time and memory space. In this paper, we present two methods, SUCR-DTW and ESUCR-DTW, which conduct similarity search for numerous prespecified patterns over multiple time-series streams under DTW while supporting data normalization. Both methods combine several techniques to mitigate these costs. They inherit the cascading lower bounds introduced in UCR-DTW, a state-of-the-art method for similarity search in static time series, to admissibly prune unpromising subsequences. To adapt to the streaming setting, SUCR-DTW incrementally updates the envelopes of newly arriving time-series subsequences and incrementally normalizes the time-series data. However, like UCR-DTW, SUCR-DTW retrieves only similar subsequences that have the same length as the patterns. ESUCR-DTW, an extension of SUCR-DTW, can find similar subsequences whose lengths differ from those of the patterns. Furthermore, the proposed methods exploit multi-threading to respond quickly to high-speed time-series streams. The experimental results show that SUCR-DTW obtains the same precision as UCR-DTW with lower wall-clock time. In addition, a comparison of SUCR-DTW and ESUCR-DTW reveals that the extended method has higher accuracy at the cost of longer wall-clock time. The paper also evaluates the influence of incremental z-score normalization and incremental min–max normalization on the obtained results.
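
To make the ideas in the abstract concrete, the following is a minimal Python sketch of three of its ingredients: incremental z-score normalization of a sliding window, an LB_Keogh-style lower bound used to prune candidates, and a band-constrained DTW as the final check. It is an illustration under simplifying assumptions, not the authors' SUCR-DTW implementation: it handles one stream and one pattern, applies a single lower bound rather than a cascade, builds the envelope around the pattern only, uses no multi-threading, and only matches windows of the pattern's length. All names (`SlidingZNorm`, `search`, the warping radius `r`, `threshold`) are hypothetical.

```python
import math
from collections import deque


def znorm(values):
    """Z-score normalize a list; near-constant input is left centered only."""
    m = len(values)
    mean = sum(values) / m
    var = max(sum(v * v for v in values) / m - mean * mean, 0.0)
    std = math.sqrt(var) if var > 1e-12 else 1.0
    return [(v - mean) / std for v in values]


class SlidingZNorm:
    """Maintains running sum and sum of squares over the last m stream values,
    so the mean and standard deviation are updated in O(1) per new point."""

    def __init__(self, m):
        self.m = m
        self.window = deque()
        self.s = 0.0
        self.s2 = 0.0

    def push(self, x):
        self.window.append(x)
        self.s += x
        self.s2 += x * x
        if len(self.window) > self.m:
            old = self.window.popleft()
            self.s -= old
            self.s2 -= old * old

    def ready(self):
        return len(self.window) == self.m

    def normalized(self):
        mean = self.s / self.m
        var = max(self.s2 / self.m - mean * mean, 0.0)
        std = math.sqrt(var) if var > 1e-12 else 1.0
        return [(v - mean) / std for v in self.window]


def envelope(q, r):
    """Lower and upper envelope of q within warping radius r."""
    n = len(q)
    lower = [min(q[max(0, i - r): i + r + 1]) for i in range(n)]
    upper = [max(q[max(0, i - r): i + r + 1]) for i in range(n)]
    return lower, upper


def lb_keogh(c, lower, upper):
    """Squared LB_Keogh: a lower bound on the squared band-constrained DTW."""
    total = 0.0
    for x, lo, up in zip(c, lower, upper):
        if x > up:
            total += (x - up) ** 2
        elif x < lo:
            total += (lo - x) ** 2
    return total


def dtw(a, b, r):
    """Squared DTW distance between equal-length a and b, Sakoe-Chiba radius r."""
    n = len(a)
    inf = float("inf")
    prev = [inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [inf] * (n + 1)
        for j in range(max(1, i - r), min(n, i + r) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[n]


def search(stream, pattern, r, threshold):
    """Return (end_index, squared_distance) for windows within threshold."""
    q = znorm(pattern)
    lower, upper = envelope(q, r)
    win = SlidingZNorm(len(pattern))
    matches = []
    for t, x in enumerate(stream):
        win.push(x)
        if not win.ready():
            continue
        c = win.normalized()
        if lb_keogh(c, lower, upper) > threshold:
            continue  # pruned without running the full DTW
        d = dtw(c, q, r)
        if d <= threshold:
            matches.append((t, d))
    return matches
```

As a usage example, searching the stream `[5, 5, 5, 10, 12, 14, 12, 10, 5, 5]` for the pattern `[0, 1, 2, 1, 0]` with `r=1` and `threshold=0.01` reports a single match ending at index 7: the window `[10, 12, 14, 12, 10]` is a scaled and shifted copy of the pattern, which is why normalization matters for this kind of search.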