Linearized Suffix Tree: An Efficient Index Data Structure With The Capabilities Of Suffix Trees And Suffix Arrays

Dong Kyue Kim; Minhwan Kim; Heejin Park

首页> 外文期刊>Algorithmica >Linearized Suffix Tree: An Efficient Index Data Structure With The Capabilities Of Suffix Trees And Suffix Arrays

【24h】

Linearized Suffix Tree: An Efficient Index Data Structure With The Capabilities Of Suffix Trees And Suffix Arrays

机译：线性化后缀树：具有后缀树和后缀数组功能的高效索引数据结构

获取原文

获取原文并翻译 | 示例

掌桥外文数据库（机构版） >>

开具论文收录证明 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

Suffix trees and suffix arrays are fundamental full-text index data structures to solve problems occurring in string processing.Since suffix trees and suffix arrays have different capabilities,some problems are solved more efficiently using suffix trees and others are solved more efficiently using suffix arrays.We consider efficient index data structures with the capabilities of both suffix trees and suffix arrays without requiring much space.When the size of an alphabet is small,enhanced suffix arrays are such index data structures.However,when the size of an alphabet is large,enhanced suffix arrays lose the power of suffix trees.Pattern searching in an enhanced suffix array takes O(m|Σ|) time while pattern searching in a suffix tree takes O(m log |Σ|) time where m is the length of a pattern and Σ is an alphabet. In this paper,we present linearized suffix trees which are efficient index data structures with the capabilities of both suffix trees and suffix arrays even when the size of an alphabet is large.A linearized suffix tree has all the functionalities of the enhanced suffix array and supports the pattern search in O(m log |Σ|) time.In a different point of view,it can be considered a practical implementation of the suffix tree supporting O(in log |Σ|)-time pattern search. In addition,we also present two efficient algorithms for computing suffix links on the enhanced suffix array and the linearized suffix tree.These are the first algorithms that run in O(n) time without using the range minima query.Our experimental results show that our algorithms are faster than the previous algorithms.

机译：后缀树和后缀数组是解决字符串处理中出现的问题的基本全文索引数据结构。由于后缀树和后缀数组具有不同的功能，因此使用后缀树可以更有效地解决某些问题，而使用后缀数组可以更有效地解决一些问题。我们认为高效的索引数据结构具有后缀树和后缀数组的功能，并且不需要太多空间。当字母的大小较小时，增强的后缀数组就是这样的索引数据结构。但是，当字母的大小较大时，增强的后缀数组失去后缀树的功能。在增强的后缀数组中进行模式搜索需要O（m |Σ|）时间，而在后缀树中进行模式搜索则需要O（m log |Σ|）时间，其中m是a的长度。模式和Σ是字母。本文提出了线性化后缀树，它是高效的索引数据结构，即使在字母表大小较大的情况下也具有后缀树和后缀数组的功能。线性化后缀树具有增强后缀数组的所有功能并支持从另一角度来看，可以认为后缀树是支持O（log |Σ|）时间模式搜索的实际实现。此外，我们还提出了两种有效的算法来计算增强后缀数组和线性化后缀树上的后缀链接。这是在O（n）时间内运行的第一个不使用范围最小值查询的算法。我们的实验结果表明，算法比以前的算法快。

著录项

来源
《Algorithmica》 |2008年第3期|p.350-377|共28页
作者
Dong Kyue Kim; Minhwan Kim; Heejin Park;
展开▼
作者单位

Hanyang University,Seoul 133-791,South Korea;

展开▼
收录信息美国《科学引文索引》(SCI);美国《工程索引》(EI);
原文格式 PDF
正文语种 eng
中图分类计算技术、计算机技术;
关键词
suffix trees; suffix arrays; index data structures; string algorithms;

机译：后缀树;后缀数组;索引数据结构;字符串算法;
入库时间 2022-08-17 13:20:11

相似文献

外文文献
中文文献
专利

1. Linearized Suffix Tree: an Efficient Index Data Structure with the Capabilities of Suffix Trees and Suffix Arrays [J] . Dong Kyue Kim, Minhwan Kim, Heejin Park Algorithmica . 2008,第3期

机译：线性化后缀树：具有后缀树和后缀数组功能的高效索引数据结构
2. Compressed suffix arrays and suffix trees with applications to text indexing and string matching [J] . Grossi R, Vitter JS SIAM Journal on Computing . 2005,第2期

机译：压缩后缀数组和后缀树及其在文本索引和字符串匹配中的应用
3. Toward optimizing the cache performance of suffix trees for sequence analysis algorithms suffix tree cache performance optimization. [J] . Lee C, Huang CH Advances in Experimental Medicine and Biology . 2010,第Null期

机译：为了优化后缀树的缓存性能，以进行序列分析算法后缀树的缓存性能优化。
4. An Efficient Index Data Structure with the Capabilities of Suffix Trees and Suffix Arrays for Alphabets of Non-negligible Size [C] . Dong Kyue Kim, Jeong Eun Jeon, Heejin Park International Conference on String Processing and Information Retrieval(SPIRE 2004); 20041005-08; Padova(IT) . 2004

机译：具有大小不可忽略的字母后缀树和后缀数组的高效索引数据结构
5. Suffix trees and suffix arrays in primary and secondary storage [D] . Ko, Pang 2007

机译：主存储和辅助存储中的后缀树和后缀数组
6. Using the Sadakane Compressed Suffix Tree to Solve the All-Pairs Suffix-Prefix Problem [O] . Maan Haj Rachid, Qutaibah Malluhi, Mohamed Abouelhoda -1

机译：使用Sadakane压缩后缀树来解决全对后缀前缀问题
7. Suffix Cactus: A Cross between Suffix Tree and Suffix Array [O] . Juha Kärkkäinen 1995

机译：后缀仙人掌：后缀树和后缀数组之间的交叉

Linearized Suffix Tree: An Efficient Index Data Structure With The Capabilities Of Suffix Trees And Suffix Arrays

摘要

著录项

相似文献

相关主题

期刊订阅