首页> 外文期刊>Theoretical computer science >Property matching and weighted matching
【24h】

Property matching and weighted matching

机译:属性匹配和加权匹配

获取原文
获取原文并翻译 | 示例
       

摘要

In many pattern matching applications the text has some properties attached to its various parts. Pattern Matching with Properties (Property Matching, for short), involves a string matching between the pattern and the text, and the requirement that the text part satisfies some property. Some immediate examples come from molecular biology where it has long been a practice to consider special areas in the genome by their structures. It is straightforward to do sequential matching in a text with properties. However, indexing in a text with properties becomes difficult if we desire the time to be output dependent. We present an algorithm for indexing a text with properties in O(n log |∑| + n log log n) time for preprocessing and O(|P| log |∑| +tocc{sub}π) per query, where n is the length of the text, P is the sought pattern, ∑ is the alphabet, and tocc{sub}π is the number of occurrences of the pattern that satisfy some property π.As a practical use of Property Matching we show how to solve Weighted Matching problems using techniques from Property Matching. Weighted sequences have recently been introduced as a tool to handle a set of sequences that are not identical but have many local similarities. The weighted sequence is a "statistical image" of this set, where we are given the probability of every symbol's occurrence at every text location. Weighted matching problems are pattern matching problems where the given text is weighted. We present a reduction from Weighted Matching to Property Matching that allows off-the-shelf solutions to numerous weighted matching problems including indexing, swapped matching, parameterized matching, approximate matching, and many more. Assuming that one seeks the occurrence of pattern P with probability ∈ in weighted text T of length n, we reduce the problem to a property matching problem of pattern P in text T' of length O (n (1/∈){sup}2 log 1/∈).
机译:在许多模式匹配应用程序中,文本的各个部分都有一些属性。具有属性的模式匹配(简称“属性匹配”)涉及模式与文本之间的字符串匹配,以及文本部分满足某些属性的要求。一些直接的例子来自分子生物学,长期以来人们一直习惯通过基因组的结构来考虑基因组中的特殊区域。在具有属性的文本中进行顺序匹配很简单。但是,如果我们希望时间取决于输出,则很难在具有属性的文本中建立索引。我们提出一种索引文本的算法,该文本具有O(n log | ∑ | + n log log n)时间的预处理属性和每个查询O(| P | log | ∑ | + tocc {sub}π)的属性,其中n为文本的长度,P是所需的模式,∑是字母,tocc {sub}π是满足某些属性π的模式出现的次数。作为属性匹配的实际用法,我们展示了如何解决加权问题使用属性匹配中的技术来匹配问题。最近引入了加权序列作为处理一组不相同但具有许多局部相似性的序列的工具。加权序列是该集合的“统计图像”,在这里我们得到每个符号在每个文本位置出现的概率。加权匹配问题是对给定文本进行加权的模式匹配问题。我们提出了从“加权匹配”到“属性匹配”的简化,它允许现成的解决方案来解决许多加权匹配问题,包括索引,交换匹配,参数化匹配,近似匹配等等。假设有人在长度为n的加权文本T中以概率ε寻找模式P的出现,我们将该问题简化为长度为O(n(1 /∈){sup} 2 log 1 /∈)。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号