Similarity Search for Interval Time Sequences

Abstract

Time sequences, which are ordered sets of observations, have been studied in various database applications. In this paper, we introduce a new class of time sequences in which each observation is represented by an interval rather than a number. Such sequences arise in many situations. For instance, we may not be able to determine the exact value at a time point because of uncertainty or aggregation. In such a case, the observation may be better represented by a range of possible values. To the best of our knowledge, similarity search for interval time sequences has not been studied, and it poses a new challenge for research. We first address the issue of (dis)similarity measures for interval time sequences. We choose an L_1 norm-based measure because it is semantically better than the alternatives. We next propose an efficient indexing technique for fast retrieval of similar interval time sequences from large databases. More specifically, we propose: (1) to extract a segment-based feature vector for each sequence, and (2) to map each feature vector to either a point or a hyper-rectangle in a multi-dimensional feature space. We then show how existing multi-dimensional index structures such as the R-tree can be used for efficient query processing. Our proposed method guarantees that no false dismissals occur.
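
The filter-and-refine idea behind the no-false-dismissal guarantee can be illustrated with a minimal Python sketch. The abstract does not give the exact dissimilarity measure or feature definition, so everything here is an assumption made for illustration: the distance is taken to be the sum of absolute differences of the lower and upper endpoints, the segment feature is the per-segment sum of each endpoint, and a linear scan stands in for the R-tree. All function names are hypothetical. Under these assumptions the triangle inequality makes the feature-space distance a lower bound on the sequence distance, so pruning on it cannot discard a true match.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (low, high) with low <= high


def l1_distance(a: Sequence[Interval], b: Sequence[Interval]) -> float:
    """Assumed L_1-style dissimilarity: summed endpoint differences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(abs(al - bl) + abs(ah - bh)
               for (al, ah), (bl, bh) in zip(a, b))


def segment_features(seq: Sequence[Interval], n_segments: int) -> List[float]:
    """Assumed feature vector: per-segment sums of low and high endpoints."""
    if n_segments <= 0 or len(seq) % n_segments != 0:
        raise ValueError("length must be a positive multiple of n_segments")
    size = len(seq) // n_segments
    features: List[float] = []
    for start in range(0, len(seq), size):
        segment = seq[start:start + size]
        features.append(sum(lo for lo, _ in segment))
        features.append(sum(hi for _, hi in segment))
    return features


def feature_distance(fa: Sequence[float], fb: Sequence[float]) -> float:
    """L_1 distance in feature space.

    Since |sum(x_i)| <= sum(|x_i|) within each segment, this value never
    exceeds l1_distance on the original sequences.
    """
    return sum(abs(x - y) for x, y in zip(fa, fb))


def range_query(db: Sequence[Sequence[Interval]],
                query: Sequence[Interval],
                eps: float,
                n_segments: int) -> List[int]:
    """Return indices of sequences within eps of the query.

    A linear scan replaces the multi-dimensional index here; in practice
    the feature vectors would be precomputed and stored in the index.
    """
    fq = segment_features(query, n_segments)
    hits: List[int] = []
    for idx, seq in enumerate(db):
        # Filter step: the lower bound already exceeds eps, so prune safely.
        if feature_distance(segment_features(seq, n_segments), fq) > eps:
            continue
        # Refine step: check the candidate against the actual distance.
        if l1_distance(seq, query) <= eps:
            hits.append(idx)
    return hits
```

The sketch covers only the case where each feature vector is a point; the mapping to hyper-rectangles mentioned in the abstract is not reproduced because its definition is not given here.
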