首页> 外文会议>String Processing and Information Retrieval; Lecture Notes in Computer Science; 4209 >Dotted Suffix Trees A Structure for Approximate Text Indexing
【24h】

Dotted Suffix Trees A Structure for Approximate Text Indexing

机译:点缀后缀树近似文本索引的结构

获取原文
获取原文并翻译 | 示例

摘要

In this work, the problem we address is text indexing for approximate matching. Given a text Τ which undergoes some preprocessing to generate an index, we can later query this index to identify the places where a string occurs up to a certain number of errors κ(edition distance). The indexing structure occupies space O(n log~k n) in the average case, independent of alphabet size. This structure can be used to report the existence of a match with k errors in O(3~km~(k+1)) and to report the occurrences in O(3~km~(k+1) + ed) time, where m is the length of the pattern and ed and the number of matching edit scripts. The construction of the structure has time bound by O(kN|∑|), where N is the number of nodes in the index and |∑| the alphabet size.
机译:在这项工作中,我们要解决的问题是文本索引以进行近似匹配。给定文本Τ进行了一些预处理以生成索引,我们稍后可以查询该索引以识别出现一定误差κ(编辑距离)的字符串出现的位置。在平均情况下,索引结构占用空间O(n log〜k n),与字母大小无关。该结构可用于报告O(3〜km〜(k + 1))中存在k个错误的匹配项的存在,并报告O(3〜km〜(k + 1)+ ed)时间中的匹配项,其中m是样式和ed的长度以及匹配的编辑脚本的数量。结构的构造受O(kN | ∑ |)的时间限制,其中N是索引中的节点数,| ∑ |字母大小。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号