Conference: International Conference on Very Large Data Bases

SigMatch: Fast and Scalable Multi-Pattern Matching

Abstract

Multi-pattern matching involves matching a data item against a large database of "signature" patterns. Existing algorithms for multi-pattern matching do not scale well as the size of the signature database increases. In this paper, we present sigMatch - a fast, versatile, and scalable technique for multi-pattern signature matching. At its heart, sigMatch organizes the signature database into a (processor) cache-efficient q-gram index structure, called the sigTree. The sigTree groups patterns based on common sub-patterns, such that signatures that don't match can be quickly eliminated from the matching process. The sigTree also uses parallel Bloom filters and a technique to reduce imbalances across groups, for improved performance. Using extensive empirical evaluation across three diverse domains, we show that sigMatch often outperforms existing methods by an order of magnitude or more.
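The filtering idea in the abstract can be illustrated with a small sketch. This is not the sigTree from the paper: it is a simplified, flat q-gram index under stated assumptions. Each signature is filed under one representative q-gram (here the q-gram that is rarest across the signature set, a crude stand-in for the paper's imbalance reduction), a single Bloom filter screens the q-grams of the scanned data, and only the signatures in a hit group are verified with a full substring check. The names `BloomFilter`, `qgrams`, `build_index`, and `match`, and the default `q=4`, are hypothetical choices for illustration.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits          # must be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive several hash positions by salting BLAKE2b with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item, digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def qgrams(data, q):
    """Return every length-q substring of a bytes object, in order."""
    return [data[i:i + q] for i in range(len(data) - q + 1)]


def build_index(signatures, q=4):
    """Group signatures by one representative q-gram each (assumed heuristic)."""
    counts = defaultdict(int)
    for sig in signatures:
        if len(sig) < q:
            raise ValueError(f"signature shorter than q={q}: {sig!r}")
        for gram in set(qgrams(sig, q)):
            counts[gram] += 1

    groups = defaultdict(list)
    bloom = BloomFilter()
    for sig in signatures:
        # Pick the q-gram shared with the fewest other signatures.
        rep = min(qgrams(sig, q), key=lambda gram: counts[gram])
        groups[rep].append(sig)
        bloom.add(rep)
    return groups, bloom


def match(data, groups, bloom, q=4):
    """Return the set of signatures that occur as substrings of data."""
    matched = set()
    seen = set()
    for gram in qgrams(data, q):
        if gram in seen:
            continue
        seen.add(gram)
        if gram not in bloom:
            continue  # cheap rejection: no signature is filed under this q-gram
        for sig in groups.get(gram, ()):
            if sig in data:  # full verification of the surviving candidates
                matched.add(sig)
    return matched


signatures = [b"malware_sig", b"evil_payload", b"trojan.gen"]
groups, bloom = build_index(signatures)
hits = match(b"xxxx evil_payload yyyy trojan.ge", groups, bloom)
```

In this sketch a signature that occurs in the data is never missed, because its representative q-gram must also occur in the data and the Bloom filter has no false negatives. In Python the Bloom filter adds little over the dictionary lookup; it is included only to show where such a filter would sit in a cache-conscious implementation.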
