首页> 外文期刊>Distributed and Parallel Databases >Cardinality estimation and dynamic length adaptation for Bloom filters
【24h】

Cardinality estimation and dynamic length adaptation for Bloom filters

机译:布隆滤波器的基数估计和动态长度自适应

获取原文
获取原文并翻译 | 示例

摘要

Bloom filters are extensively used in distributed applications, especially in distributed databases and distributed information systems, to reduce network requirements and to increase performance. In this work, we propose two novel Bloom filter features that are important for distributed databases and information systems. First, we present a new approach to encode a Bloom filter such that its length can be adapted to the cardinality of the set it represents, with negligible overhead with respect to computation and false positive probability. The proposed encoding allows for significant network savings in distributed databases, as it enables the participating nodes to optimize the length of each Bloom filter before sending it over the network, for example, when executing Bloom joins. Second, we show how to estimate the number of distinct elements in a Bloom filter, for situations where the represented set is not materialized. These situations frequently arise in distributed databases, where estimating the cardinality of the represented sets is necessary for constructing an efficient query plan. The estimation is highly accurate and comes with tight probabilistic bounds. For both features we provide a thorough probabilistic analysis and extensive experimental evaluation which confirm the effectiveness of our approaches.
机译:布隆过滤器广泛用于分布式应用程序中,尤其是在分布式数据库和分布式信息系统中,以减少网络需求并提高性能。在这项工作中,我们提出了两个新颖的Bloom过滤器功能,这些功能对于分布式数据库和信息系统很重要。首先,我们提出了一种对布隆过滤器进行编码的新方法,以使其长度可以适应它所表示的集合的基数,而在计算和错误肯定概率方面的开销却可以忽略不计。提议的编码可以在分布式数据库中节省大量网络,因为它使参与节点可以在通过网络发送每个Bloom过滤器之前(例如,执行Bloom join时)优化每个Bloom过滤器的长度。其次,我们展示了在未实现所表示集合的情况下,如何估计布隆过滤器中不同元素的数量。这些情况经常出现在分布式数据库中,在该数据库中,估计代表集的基数对于构建有效的查询计划是必需的。该估计是高度准确的,并且具有严格的概率范围。对于这两个功能,我们提供了彻底的概率分析和广泛的实验评估,证实了我们方法的有效性。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号