Efficient Subsequence Matching Using the Longest Common Subsequence with a Dual Match Index

机译：使用最长的公共子序列和双重匹配索引进行有效的子序列匹配

获取原文

获取原文并翻译 | 示例

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

The purpose of subsequence matching is to find a query sequence from a long data sequence. Due to the abundance of applications, many solutions have been proposed. Virtually all previous solutions use the Euclidean measure as the basis for measuring distance between sequences. Recent studies, however, suggest that the Euclidean distance often fails to produce proper results due to the irregularity in the data, which is not so uncommon in our problem domain. Addressing this problem, some non-Euclidean measures, such as Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), have been proposed. However, most of the previous work in this direction focused on the whole sequence matching problem where query and data sequences are the same length. In this paper, we propose a novel subsequence matching framework using a non-Euclidean measure, in particular, LCS, and a new index query scheme. The proposed framework is based on the Dual Match framework where data sequences are divided into a series of disjoint equi-length subsequences and then indexed in an R-tree. We introduced similarity bound for index matching with LCS. The proposed query matching scheme reduces significant numbers of false positives in the match result. Furthermore, we developed an algorithm to skip expensive LCS computations through observing the warping paths. We validated our framework through extensive experiments using 48 different time series datasets. The results of the experiments suggest that our approach significantly improves the subsequence matching performance in various metrics.

机译：子序列匹配的目的是从长数据序列中查找查询序列。由于大量的应用，已经提出了许多解决方案。几乎所有以前的解决方案都使用欧几里得度量作为测量序列之间距离的基础。然而，最近的研究表明，由于数据的不规则性，欧几里得距离通常无法产生适当的结果，这在我们的问题域中并不少见。为了解决这个问题，已经提出了一些非欧几里得度量，例如动态时间规整（DTW）和最长公共子序列（LCS）。但是，此方向上的大多数先前工作都集中在查询和数据序列长度相同的整个序列匹配问题上。在本文中，我们提出了一种使用非欧氏测度（尤其是LCS）的新型子序列匹配框架以及新的索引查询方案。提出的框架基于对偶匹配框架，其中数据序列被分为一系列不相交的等长子序列，然后在R树中进行索引。我们引入了与LCS进行索引匹配的相似性界限。所提出的查询匹配方案减少了匹配结果中的大量误报。此外，我们开发了一种算法，可通过观察翘曲路径来跳过昂贵的LCS计算。我们使用48个不同的时间序列数据集通过广泛的实验验证了我们的框架。实验结果表明，我们的方法可以显着改善各种指标下的子序列匹配性能。

著录项

来源
《Machine Learning and Data Mining in Pattern Recognition(MLDM 2007); 20070718-20; Leipzig(DE)》|2007年|P.585-600|共16页
会议地点 Leipzig(DE)
作者
Tae Sik Han; Seung-Kyu Ko; Jaewoo Kang;
展开▼
作者单位

Dept. Of Computer Science, North Carolina State University Raleigh, NC 27569, USA;

Dept. Of Computer Science and Engineering, Korea University, Seoul 136-705, Korea;

展开▼
会议组织
原文格式 PDF
正文语种 eng
中图分类计算机的应用;
关键词
subsequence matching; longest common subsequence; dual match;

机译：子序列匹配;最长公共子序列;双重匹配;

相似文献

外文文献
中文文献
专利

1. Efficient time-series subsequence matching using duality in constructing windows [J] . Yang-Sae Moon, Kyu-Young Whang, Woong-Kee Loh Information Systems . 2001,第4期

机译：在构造窗口中使用对偶性的高效时间序列子序列匹配
2. Online signature verification based on signatures turning angle representation using longest common subsequence matching [J] . K. Barkoula, G. Economou, S. Fotopoulos International Journal on Document Analysis and Recognition (IJDAR) . 2013,第3期

机译：使用最长公共子序列匹配的基于签名转折角表示的在线签名验证
3. Online signature verification based on signatures turning angle representation using longest common subsequence matching [J] . K. Barkoula, G. Economou, S. Fotopoulos International Journal on Document Analysis and Recognition . 2013,第3期

机译：使用最长公共子序列匹配的基于签名转折角表示的在线签名验证
4. Efficient Subsequence Matching Using the Longest Common Subsequence with a Dual Match Index [C] . Tae Sik Han, Seung-Kyu Ko, Jaewoo Kang Machine Learning and Data Mining in Pattern Recognition(MLDM 2007) . 2007

机译：使用具有双匹配索引的最长常见子项的高效子序列匹配
5. Efficient subsequence matching with LCS. [D] . Han, Tae Sik. 2007

机译：与LCS的高效子序列匹配。
6. A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem [O] . Zhan Peng, Yuping Wang 2017

机译：多重最长公共子序列（MLCS）问题的新型有效图模型
7. Efficient Time-Series Subsequence Matching using Duality in Constructing Windows [O] . Yang-sae Moon, Kyu-young Whang, Woong-kee Loh 2000

机译：构造Windows时使用对偶性的高效时间序列子序列匹配

Efficient Subsequence Matching Using the Longest Common Subsequence with a Dual Match Index

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅