Small Subset Queries and Bloom Filters Using Ternary Associative Memories with Applications

机译：使用三元关联内存和应用程序的小型子集查询和Bloom过滤器

获取原文

获取原文并翻译 | 示例

获取外文期刊封面目录资料

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

Associative memories offer high levels of parallelism in matching a query against stored entries. We design and analyze an architecture which uses a single lookup into a Ternary Content Addressable Memory (TCAM) to solve the subset query problem for small sets, i.e., to check whether a given set (the query) contains (or alternately, is contained in) any one of a large collection of sets in a database. We use each TCAM entry as a small Ternary Bloom Filter (each 'bit' of which is one of {0,1,"*" }) to store one of the sets in the collection. Like Bloom filters, our architecture is susceptible to false positives. Since each TCAM entry is quite small, asymptotic analyses of Bloom filters do not directly apply. Surprisingly, we are able to show that the asymptotic false positive probability formula can be safely used if we penalize the small Bloom filter by taking away just one bit of storage and adding just half an extra set element before applying the formula. We believe that this analysis is independently interesting. The subset query problem has applications in databases, network intrusion detection, packet classification in Internet routers, and Information Retrieval. We demonstrate our architecture on one illustrative streaming application - intrusion detection in network traffic. By shingling (i.e., taking consecutive bytes of) the strings in the database, we can perform a single subset query, and hence a single TCAM search, to skip many bytes in the stream. We evaluate our scheme on the open source CLAM anti-virus database, for worst-case as well as random streams. Our architecture appears to be at least one order of magnitude faster than previous approaches. Since the individual Bloom filters must fit in a single TCAM entry (currently 72 to 576 bits), our solution applies only when each set is of a small cardinality. However, this is sufficient for many typical applications. Also, recent algorithms for the subset-query problem use a small-set version as a subroutine.

机译：在将查询与存储的条目进行匹配时，关联存储器提供了高水平的并行性。我们设计和分析一种架构，该架构使用单查询到三元内容可寻址内存（TCAM）中来解决小集合的子集查询问题，即检查给定集合（查询）是否包含（或替代地）包含在其中。）数据库中大量集合中的任何集合。我们将每个TCAM条目用作一个小的三元Bloom过滤器（每个“位”为{0,1，“ *”}之一）将其中一组存储在集合中。像Bloom过滤器一样，我们的体系结构容易受到误报的影响。由于每个TCAM条目都非常小，因此不直接应用Bloom滤波器的渐近分析。出乎意料的是，我们能够证明，如果我们在应用公式之前仅占用一点存储空间并仅添加一半的额外设置元素来惩罚小的Bloom过滤器，则可以安全地使用渐近误报概率公式。我们认为这种分析是独立有趣的。子集查询问题在数据库，网络入侵检测，Internet路由器中的数据包分类和信息检索中都有应用。我们在一个说明性的流应用程序上演示了我们的体系结构-网络流量中的入侵检测。通过将数据库中的字符串混合在一起（即获取连续字节），我们可以执行单个子集查询，从而执行单个TCAM搜索，以跳过流中的许多字节。我们在开源CLAM防病毒数据库上评估我们的方案，以应对最坏情况以及随机流。我们的体系结构似乎比以前的方法快至少一个数量级。由于各个Bloom过滤器必须适合单个TCAM条目（当前为72到576位），因此我们的解决方案仅在每组基数较小时适用。但是，这对于许多典型应用程序已经足够。同样，用于子集查询问题的最新算法使用小版本版本作为子例程。

著录项

来源
《International conference on measurement and modeling of computer systems 2010》|2010年|p.143-154|共12页
会议地点 New York NY(US);New York NY(US)
作者
Ashish Goel; Pankaj Gupta;
展开▼
作者单位

Stanford University Stanford, California;

Twitter, Inc. San Francisco, California;

展开▼
会议组织
原文格式 PDF
正文语种 eng
中图分类计算技术、计算机技术;
关键词
subset queries; bloom filters; TCAM;

机译：子集查询；布隆过滤器； TCAM;

相似文献

外文文献
中文文献
专利

1. Small Subset Queries and Bloom Filters Using Ternary Associative Memories, with Applications [J] . Ashish Goel, rnPankaj Gupta Performance evaluation review . 2010,第1期

机译：使用三元关联内存的小型子集查询和Bloom过滤器及其应用
2. An Insight Review on Bloom Filter and Its Variants with Applications:An Emerging Hash Based Membership Querying Technique [J] . K. Saravanan, A. Senthilkumar Journal of Discrete Mathematical Sciences and Cryptography . 2015,第3期

机译：Bloom过滤器及其应用变体的洞察力回顾：一种基于哈希的新兴成员查询技术
3. Efficient and robust associative memory from a generalized Bloom filter [J] . Sterne P. Biological Cybernetics: Communication and Control in Organisms and Automata: = Nachrichtenubertragung, Nachrichtenverarbeitung, Steuerung und Regelung in Organismen und in Automaten . 2012,第4a5期

机译：来自广义Bloom过滤器的高效强大的关联存储器
4. Small Subset Queries and Bloom Filters Using Ternary Associative Memories with Applications [C] . Ashish Goel, Pankaj Gupta International conference on measurement and modeling of computer systems . 2010

机译：使用带有应用程序的三元关联存储器的小型子集查询和绽放过滤器
5. Improving the performance of parallel SPARQL query processing on Apache Spark using Bloom filters. [D] . Kolla, Venubabu. 2016

机译：使用Bloom过滤器提高Apache Spark上并行SPARQL查询处理的性能。
6. Using cascading Bloom filters to improve the memory usage for de Brujin graphs [O] . Kamil Salikhov, Gustavo Sacomoto, Gregory Kucherov 2014

机译：使用级联Bloom过滤器来改善de Bruijn图的内存使用量
7. Small subset queries and Bloom filters using ternary associative memories, with applications [O] . Ashish Goel, Pankaj Gupta 2010

机译：使用三元关联存储器和应用程序的小子集查询和布隆过滤器
8. Practical Applications of Bloom Filters to the NIST RDS and Hard Drive Triage; Conference paper [R] . Farrell, P., Garfinkel, S. L., White, D. 2008

机译：Bloom过滤器在NIsT RDs和硬盘驱动器分类中的实际应用;会议文件

Small Subset Queries and Bloom Filters Using Ternary Associative Memories with Applications

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅