International Conference on Very Large Data Bases; VLDB 2010

SigMatch: Fast and Scalable Multi-Pattern Matching


Abstract

Multi-pattern matching involves matching a data item against a large database of "signature" patterns. Existing algorithms for multi-pattern matching do not scale well as the size of the signature database increases. In this paper, we present sigMatch - a fast, versatile, and scalable technique for multi-pattern signature matching. At its heart, sigMatch organizes the signature database into a (processor) cache-efficient q-gram index structure, called the sigTree. The sigTree groups patterns based on common sub-patterns, such that signatures that don't match can be quickly eliminated from the matching process. The sigTree also uses parallel Bloom filters and a technique to reduce imbalances across groups, for improved performance. Using extensive empirical evaluation across three diverse domains, we show that sigMatch often outperforms existing methods by an order of magnitude or more.
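
The filter-then-verify idea behind the abstract can be illustrated with a minimal sketch. This is not the paper's sigTree: it assumes a flat grouping of signatures by their first q-gram, a single small Bloom filter, and exact byte-string signatures. The names `BloomFilter`, `build_index`, `match`, and the choice `Q = 3` are hypothetical and chosen only for illustration.

```python
import hashlib

Q = 3  # q-gram length (hypothetical choice for this sketch)


class BloomFilter:
    """Small Bloom filter over byte strings (illustrative, not tuned)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by prefixing a seed byte before hashing.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def build_index(signatures):
    """Group signatures by their first q-gram and record those q-grams."""
    bloom = BloomFilter()
    groups = {}
    for sig in signatures:
        if len(sig) < Q:
            raise ValueError("signature shorter than q-gram length")
        key = sig[:Q]
        bloom.add(key)
        groups.setdefault(key, []).append(sig)
    return bloom, groups


def match(data, bloom, groups):
    """Return the set of signatures that occur somewhere in data."""
    found = set()
    for i in range(len(data) - Q + 1):
        gram = data[i:i + Q]
        if gram not in bloom:
            continue  # no signature starts with this q-gram: skip quickly
        for sig in groups.get(gram, ()):
            if data.startswith(sig, i):  # verify the full signature here
                found.add(sig)
    return found


signatures = [b"malware", b"malicious", b"trojan"]
bloom, groups = build_index(signatures)
result = match(b"xx a trojan and malware here", bloom, groups)
```

In this sketch most positions in the data are rejected by the Bloom filter alone, and full comparison runs only against the small group of signatures sharing the matching q-gram. The cache-efficient layout, parallel Bloom filters, and group balancing mentioned in the abstract are not modeled here.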
