Similarity Search for Interval Time Sequences

Abstract

Time sequences, which are ordered sets of observations, have been studied in various database applications. In this paper, we introduce a new class of time sequences in which each observation is represented by an interval rather than a number. Such sequences may arise in many situations. For instance, we may not be able to determine the exact value at a time point due to uncertainty or aggregation. In such a case, the observation may be better represented by a range of possible values. To the best of our knowledge, similarity search for interval time sequences has not been studied, and it poses a new research challenge. We first address the issue of (dis)similarity measures for interval time sequences. We choose an L_1 norm-based measure because it is semantically better than the alternatives. We next propose an efficient indexing technique for fast retrieval of similar interval time sequences from large databases. More specifically, we propose: (1) to extract a segment-based feature vector for each sequence, and (2) to map each feature vector to either a point or a hyper-rectangle in a multi-dimensional feature space. We then show how existing multi-dimensional index structures such as the R-tree can be used for efficient query processing. Our proposed method guarantees that no false dismissals occur.
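The abstract does not spell out the measure or the feature extraction, so the following is only a minimal sketch of the general filter-and-refine idea under stated assumptions, not the authors' method. It assumes the L_1 distance between two intervals is the sum of the absolute differences of their lower and upper endpoints, and it uses per-segment endpoint sums as a hypothetical segment-based feature. By the triangle inequality the feature distance never exceeds the true distance, which is what rules out false dismissals. All function names are illustrative, only the point mapping is shown (not the hyper-rectangle case), and a linear scan stands in for an R-tree lookup.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (low, high), assumed low <= high


def l1_distance(a: List[Interval], b: List[Interval]) -> float:
    """Assumed measure: L_1 distance summed over both interval endpoints."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(abs(al - bl) + abs(ah - bh)
               for (al, ah), (bl, bh) in zip(a, b))


def segment_features(seq: List[Interval], n_segments: int) -> List[float]:
    """Hypothetical feature: per-segment sums of lower and upper endpoints."""
    if n_segments <= 0 or len(seq) % n_segments != 0:
        raise ValueError("length must be a positive multiple of n_segments")
    seg_len = len(seq) // n_segments
    feats: List[float] = []
    for s in range(n_segments):
        chunk = seq[s * seg_len:(s + 1) * seg_len]
        feats.append(sum(lo for lo, _ in chunk))
        feats.append(sum(hi for _, hi in chunk))
    return feats


def feature_distance(fa: List[float], fb: List[float]) -> float:
    """L_1 distance in feature space; a lower bound on l1_distance because
    |sum(x) - sum(y)| <= sum(|x - y|) within each segment."""
    return sum(abs(x - y) for x, y in zip(fa, fb))


def range_query(db: List[List[Interval]], query: List[Interval],
                eps: float, n_segments: int) -> List[int]:
    """Return indices of sequences within eps of the query.

    The linear scan over features stands in for an R-tree range search.
    """
    fq = segment_features(query, n_segments)
    hits: List[int] = []
    for idx, seq in enumerate(db):
        # Filter step: pruning is safe because the feature distance
        # lower-bounds the true distance.
        if feature_distance(segment_features(seq, n_segments), fq) > eps:
            continue
        # Refine step: discard false alarms using the full sequences.
        if l1_distance(seq, query) <= eps:
            hits.append(idx)
    return hits
```

In a real index the feature vectors would be computed once at insertion time and stored in a multi-dimensional structure, with only the surviving candidates fetched for the refine step.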

Bibliographic details

Source: International Conference on Database Systems for Advanced Applications (DASFAA 2004); March 17-19, 2004; Jeju Island, KR | 2004 | pp. 232-243 | 12 pages
Conference location: Jeju Island (KR)
Authors: Byoung-Kee Yi; Jong-Won Roh
Affiliation: Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hyoja-dong, Pohang, Korea
Original format: PDF
Language: English
Chinese Library Classification: Various special-purpose databases

Similar literature

1. Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment [J]. Jerome Gracy... Bioinformatics. 1998, No. 2

2. Reference interval-free quick retrieval from time sequence data - reference interval-free active search (RIFAS) [J]. Takuichi Nishimura, Shinobu Ogi, Nobuhiro Sekimoto. 電子情報通信学会技術研究報告. パターン認識·メディア理解. Pattern Recognition and Media Understanding. 2000, No. 311

3. Similarity Search for Interval Time Sequences [C]. Byoung-Kee Yi, Jong-Won Roh. International Conference on Database Systems for Advanced Applications. 2004

4. Sequence and structure similarity search in biological and XML databases [D]. Aghili, S. Alireza. 2005

5. Convex hulls in Hamming space enable efficient search for similarity and clustering of genomic sequences [O]. David S. Campo, Yury Khudyakov. 2020

6. Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment [O]. J. Gracy, P. Argos. 1998