首页> 外文期刊>Computers, IEEE Transactions on >Multi-Granularity Locality-Sensitive Bloom Filter
【24h】

Multi-Granularity Locality-Sensitive Bloom Filter

机译:多粒度局部敏感布隆过滤器

获取原文
获取原文并翻译 | 示例

摘要

In many applications, such as homeland security, image processing, social network, and bioinformatics, it is often required to support an approximate membership query (AMQ) to answer a question like “is an (query) object near to at least one of the objects in the given data set ?” However, existing techniques for processing AMQs require a key parameter, i.e., the distance value, to be defined in advance for the query processing. In this paper, we propose a novel filter, called multi-granularity locality-sensitive Bloom filter (MLBF), which can process AMQs with multiple distance granularities. Specifically, the MLBF is composed of two Bloom filters (BF), one is called basic multi-granularity locality-sensitive BF (BMLBF), and the other is called multi-granularity verification BF (MVBF). The BMLBF is used to store the data objects. It adopts an alignable locality-sensitive hashing (LSH) function family to support multiple granularities. The MVBF is used to reduce the false positive rate of the MLBF. The false negative rate of the MLBF is reduced by applying AND-constructions followed by an OR-construction. In addition, based on the MLBF structure, we suggest a more space-effective variant, called the MLBF*, to further reduce space cost. Theoretical analyses for estimating false positiveegative rates of the MLBF/MLBF* are given. Experiments using synthetic and real data show that the theoretical estimates are quite accurate, and the MLBF/MLBF* technique can handle AMQs with low false positive and negative rates for multiple distance granularities.
机译:在许多应用中,例如国土安全,图像处理,社交网络和生物信息学,通常需要支持近似成员资格查询(AMQ)来回答诸如“是至少一个对象附近的(查询)对象”这样的问题。给定数据集中的对象?”但是,用于处理AMQ的现有技术需要为查询处理预先定义关键参数,即距离值。在本文中,我们提出了一种新颖的过滤器,称为多粒度局部敏感布隆过滤器(MLBF),它可以处理具有多个距离粒度的AMQ。具体而言,MLBF由两个布隆过滤器(BF)组成,一个称为基本多粒度局部性敏感BF(BMLBF),另一个称为多粒度验证BF(MVBF)。 BMLBF用于存储数据对象。它采用可对齐的局部敏感哈希(LSH)函数系列来支持多种粒度。 MVBF用于减少MLBF的误报率。 MLBF的误报率通过应用“与”构造和“或”构造来减少。另外,基于MLBF结构,我们建议使用一种更节省空间的变体MLBF *,以进一步降低空间成本。给出了估计MLBF / MLBF *的假阳性/阴性率的理论分析。使用合成数据和真实数据进行的实验表明,理论估计值非常准确,并且MLBF / MLBF *技术可以处理多种距离粒度的误报率和误报率较低的AMQ。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号