首页> 外文会议>International conference on measurement and modeling of computer systems 2010 >Small Subset Queries and Bloom Filters Using Ternary Associative Memories with Applications
【24h】

Small Subset Queries and Bloom Filters Using Ternary Associative Memories with Applications

机译:使用三元关联内存和应用程序的小型子集查询和Bloom过滤器

获取原文
获取原文并翻译 | 示例
获取外文期刊封面目录资料

摘要

Associative memories offer high levels of parallelism in matching a query against stored entries. We design and analyze an architecture which uses a single lookup into a Ternary Content Addressable Memory (TCAM) to solve the subset query problem for small sets, i.e., to check whether a given set (the query) contains (or alternately, is contained in) any one of a large collection of sets in a database. We use each TCAM entry as a small Ternary Bloom Filter (each 'bit' of which is one of {0,1,"*" }) to store one of the sets in the collection. Like Bloom filters, our architecture is susceptible to false positives. Since each TCAM entry is quite small, asymptotic analyses of Bloom filters do not directly apply. Surprisingly, we are able to show that the asymptotic false positive probability formula can be safely used if we penalize the small Bloom filter by taking away just one bit of storage and adding just half an extra set element before applying the formula. We believe that this analysis is independently interesting. The subset query problem has applications in databases, network intrusion detection, packet classification in Internet routers, and Information Retrieval. We demonstrate our architecture on one illustrative streaming application - intrusion detection in network traffic. By shingling (i.e., taking consecutive bytes of) the strings in the database, we can perform a single subset query, and hence a single TCAM search, to skip many bytes in the stream. We evaluate our scheme on the open source CLAM anti-virus database, for worst-case as well as random streams. Our architecture appears to be at least one order of magnitude faster than previous approaches. Since the individual Bloom filters must fit in a single TCAM entry (currently 72 to 576 bits), our solution applies only when each set is of a small cardinality. However, this is sufficient for many typical applications. Also, recent algorithms for the subset-query problem use a small-set version as a subroutine.
机译:在将查询与存储的条目进行匹配时,关联存储器提供了高水平的并行性。我们设计和分析一种架构,该架构使用单查询到三元内容可寻址内存(TCAM)中来解决小集合的子集查询问题,即检查给定集合(查询)是否包含(或替代地)包含在其中。 )数据库中大量集合中的任何集合。我们将每个TCAM条目用作一个小的三元Bloom过滤器(每个“位”为{0,1,“ *”}之一)将其中一组存储在集合中。像Bloom过滤器一样,我们的体系结构容易受到误报的影响。由于每个TCAM条目都非常小,因此不直接应用Bloom滤波器的渐近分析。出乎意料的是,我们能够证明,如果我们在应用公式之前仅占用一点存储空间并仅添加一半的额外设置元素来惩罚小的Bloom过滤器,则可以安全地使用渐近误报概率公式。我们认为这种分析是独立有趣的。子集查询问题在数据库,网络入侵检测,Internet路由器中的数据包分类和信息检索中都有应用。我们在一个说明性的流应用程序上演示了我们的体系结构-网络流量中的入侵检测。通过将数据库中的字符串混合在一起(即获取连续字节),我们可以执行单个子集查询,从而执行单个TCAM搜索,以跳过流中的许多字节。我们在开源CLAM防病毒数据库上评估我们的方案,以应对最坏情况以及随机流。我们的体系结构似乎比以前的方法快至少一个数量级。由于各个Bloom过滤器必须适合单个TCAM条目(当前为72到576位),因此我们的解决方案仅在每组基数较小时适用。但是,这对于许多典型应用程序已经足够。同样,用于子集查询问题的最新算法使用小版本版本作为子例程。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号