Conference: International Conference on Information and Knowledge Management

Prefix-Querying: An Approach for Effective Subsequence Matching Under Time Warping in Sequence Databases

Abstract

This paper discusses index-based subsequence matching that supports time warping in large sequence databases. Time warping makes it possible to find sequences with similar patterns even when they are of different lengths. In our earlier work, we proposed an efficient method for whole matching under time warping. That method extracts feature vectors that are invariant to time warping from the data sequences and builds a multidimensional index on them. For filtering in the feature space, it applies a lower-bound function that consistently underestimates the time warping distance and also satisfies the triangle inequality. In this paper, we incorporate a prefix-querying approach based on sliding windows into the earlier method. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using the feature vectors as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even on large databases. We also prove that our approach does not incur false dismissals. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves significant speedup on real-world S&P 500 stock data and on very large synthetic data.
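The filter-and-refine idea behind the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not name the feature vector or the base distance, so the code below assumes a four-component feature (first, last, greatest, smallest element) and a time warping distance that takes the maximum element gap along the warping path. Under those assumptions the largest feature difference never exceeds the warping distance, so pruning with it causes no false dismissals. A linear scan over fixed-length windows stands in for the multidimensional index, the prefix-querying step itself is not reproduced, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Feature = Tuple[float, float, float, float]


def features(seq: Sequence[float]) -> Feature:
    """Assumed warping-invariant feature: (first, last, greatest, smallest)."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(f: Feature, g: Feature) -> float:
    """Largest component-wise gap; never exceeds dtw_linf of the sequences."""
    return max(abs(a - b) for a, b in zip(f, g))


def dtw_linf(s: Sequence[float], q: Sequence[float]) -> float:
    """Time warping distance where a path costs its largest element gap."""
    inf = float("inf")
    prev = [inf] * (len(q) + 1)
    prev[0] = 0.0  # empty prefixes match at zero cost
    for i in range(1, len(s) + 1):
        cur = [inf] * (len(q) + 1)
        for j in range(1, len(q) + 1):
            gap = abs(s[i - 1] - q[j - 1])
            best_prefix = min(prev[j], cur[j - 1], prev[j - 1])
            cur[j] = max(gap, best_prefix)
        prev = cur
    return prev[len(q)]


def window_search(data: Sequence[float], query: Sequence[float],
                  w: int, eps: float) -> List[int]:
    """Start offsets of length-w windows within eps of the query.

    Linear scan used in place of an index (a simplification).
    """
    query_feature = features(query)
    hits = []
    for start in range(len(data) - w + 1):
        window = data[start:start + w]
        # Filter step: cheap lower bound prunes without false dismissals.
        if lower_bound(features(window), query_feature) > eps:
            continue
        # Refine step: exact distance removes remaining false alarms.
        if dtw_linf(window, query) <= eps:
            hits.append(start)
    return hits
```

For example, `window_search([1, 2, 3, 4, 5, 4, 3, 2, 1], [3, 4, 4, 5], 3, 0.0)` returns `[2]`: the window `[3, 4, 5]` matches the longer query exactly once its middle element is warped, and every other window is pruned by the lower bound before the exact distance is computed.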
