首页> 外文会议>International workshop on combinatorial algorithms >Suffix Tree of Alignment: An Efficient Index for Similar Data
【24h】

Suffix Tree of Alignment: An Efficient Index for Similar Data

机译:后缀比对树:类似数据的有效索引

获取原文

摘要

We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings A and B is a compacted trie representing all suffixes in A and B. It has |A| + |B| leaves and can be constructed in O(|A| + |B|) time. However, if the two strings are similar, the generalized suffix tree is not efficient because it does not exploit the similarity which is usually represented as an alignment of A and B. In this paper we propose a space/time-efficient suffix tree of alignment which wisely exploits the similarity in an alignment. Our suffix tree for an alignment of A and B has | A| + l_d +l_1 leaves where l_d is the sum of the lengths of all parts of B different from A and l_l is the sum of the lengths of some common parts of A and B. We did not compromise the pattern search to reduce the space. Our suffix tree can be searched for a pattern P in O(|P| + occ) time where occ is the number of occurrences of P in A and B. We also present an efficient algorithm to construct the suffix tree of alignment. When the suffix tree is constructed from scratch, the algorithm requires O(|A| + l_d + l_1 + l_2) time where l_2 is the sum of the lengths of other common substrings of A and B. When the suffix tree of A is already given, it requires O(l_d + l_1 +l_2 ) time.
机译:我们考虑类似字符串的索引数据结构。通用后缀树可以解决此问题。两个字符串A和B的广义后缀树是一个压缩的特里树,表示A和B中的所有后缀。它具有| A |。 + | B |可以在O(| A | + | B |)时间中构造。但是,如果两个字符串相似,则通用后缀树效率不高,因为它没有利用通常表示为A和B的对齐方式的相似性。在本文中,我们提出了一种节省空间/时间的后缀树明智地利用比对中的相似性。我们用于A和B对齐的后缀树具有| A | + l_d + l_1离开,其中l_d是B与A不同的所有部分的长度之和,而l_1是A和B某些公共部分的长度之和。我们没有为了减少空间而妥协模式搜索。可以在O(| P | + occ)时间中搜索后缀树以找到模式P,其中occ是A和B中P出现的次数。我们还提出了一种构造对齐后缀树的有效算法。当从头开始构建后缀树时,该算法需要O(| A | + l_d + l_1 + l_2)时间,其中l_2是A和B的其他常见子串的长度之和。给定,它需要O(l_d + l_1 + l_2)时间。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号