【24h】

On-Line Linear-Time Construction of Word Suffix Trees

机译:词后缀树的在线线性时间构造

获取原文
获取原文并翻译 | 示例

摘要

Suffix trees are the key data structure for text string matching, and are used in wide application areas such as bioinformatics and data compression. Sparse suffix trees are kind of suffix trees that represent only a subset of suffixes of the input string. In this paper we study word suffix trees, which are one variation of sparse suffix trees. Let D be a dictionary of words and w be a string in D~+, namely, w is a sequence w_1 • • • w_k of k words in D. The word suffix tree of w w.r.t. D is a path-compressed trie that represents only the k suffixes in the form of w_i • • • w_k • A typical example of its application is word- and phrase-level search on natural language documents. Andersson et al. proposed an algorithm to build word suffix trees in O(n) expected time with O(k) space. In this paper we present a new word suffix tree construction algorithm with O(n) running time and O(k) space in the worst cases. Our algorithm is on-line, which means that it can sequentially process the characters in the input, each by each, from left to right.
机译:后缀树是用于文本字符串匹配的关键数据结构,并在诸如生物信息学和数据压缩之类的广泛应用领域中使用。稀疏后缀树是一种后缀树,仅表示输入字符串的后缀的子集。本文研究词后缀树,它是稀疏后缀树的一种变体。令D为单词词典,w为D〜+中的字符串,即w为D中k个单词的序列w_1•••w_k。w w.r.t的单词后缀树。 D是路径压缩的特里,仅表示形式为w_i•••w_k•的k个后缀。其典型应用示例是在自然语言文档上进行单词和短语级搜索。安德森(Andersson)等人。提出了一种在O(k)空间的O(n)期望时间内构建单词后缀树的算法。在本文中,我们提出了一种在最坏情况下具有O(n)运行时间和O(k)空间的新词后缀树构造算法。我们的算法是在线的,这意味着它可以按从左到右的顺序依次处理输入中的字符。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号