【24h】

Dictionary Matching and Indexing with Errors and Don't Cares

机译:字典匹配和索引错误且无关紧要

获取原文
获取原文并翻译 | 示例

摘要

This paper considers various flavors of the following online problem: preprocess a text or collection of strings, so that given a query string p, all matches of p with the text can be reported quickly. In this paper we consider matches in which a bounded number of mismatches are allowed, or in which a bounded number of "don't care" characters are allowed. The specific problems we look at are: indexing, in which there is a single text t, and we seek locations where p matches a substring of t; dictionary queries, in which a collection of strings is given upfront, and we seek those strings which match p in their entirety; and dictionary matching, in which a collection of strings is given upfront, and we seek those substrings of a (long) p which match an original string in its entirety. These are all instances of an all-to-all matching problem, for which we provide a single solution. The performance bounds all have a similar character. For example, for the indexing problem with n = |t| and m = |p|, the query time for k substitutions is O(m + ((c_1 log n)~k)/(k!)) + # matches), with a data structure of size O(n((c_2 log n)~k)/(k!))) and a preprocessing time of O(n((c_2 log n)~k)/(k!))), where c_1, c_2 > 1 are constants. The deterministic preprocessing assumes a weakly nonuniform RAM model; this assumption is not needed if randomization is used in the preprocessing.
机译:本文考虑了以下在线问题的各种形式:预处理文本或字符串集合,以便在给出查询字符串p的情况下,可以快速报告p与文本的所有匹配项。在本文中,我们考虑了允许有一定数量的不匹配项或允许有一定数量的“无关”字符的匹配项。我们研究的具体问题是:索引,其中只有一个文本t,我们寻找p匹配t的子字符串的位置;字典查询,其中预先给出了字符串的集合,然后我们查找与p整体匹配的那些字符串;字典匹配,其中预先给出了字符串的集合,然后我们查找(long)p的那些子字符串,这些子字符串整体上与原始字符串匹配。这些都是全面匹配问题的全部实例,为此我们提供了一个解决方案。性能界限都具有相似的特征。例如,对于n = | t |的索引问题并且m = | p |,则k次替换的查询时间为O(m +((c_1 log n)〜k)/(k!))+#个match),数据结构的大小为O(n(((c_2 log n)〜k)/(k!)))的预处理时间为O(n((c_2 log n)〜k)/(k!))),其中c_1,c_2> 1为常数。确定性预处理假定了一个弱不均匀的RAM模型。如果在预处理中使用随机化,则不需要此假设。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号