International Journal of Bio-Inspired Computation

A comparative study of cuckoo search and bat algorithm for Bloom filter optimisation in spam filtering


Abstract

A Bloom filter (BF) is a simple but powerful data structure for testing membership in a static set. The trade-off of using a Bloom filter is a configurable risk of false positives, which can be made very low if the hash bitmap is sufficiently large. Spam is an irrelevant or inappropriate message sent over the internet to a large number of newsgroups or users. A spam word list is a list of well-known words that often appear in spam mails. The proposed bin Bloom filter (BBF) system groups these words into a number of bins with different false positive rates, based on the weights of the spam words. Cuckoo search (CS) and the bat algorithm are bio-inspired algorithms that imitate the breeding behaviour of cuckoos and the foraging behaviour of microbats, respectively. This paper applies CS and the bat algorithm to minimise the total membership invalidation cost of the BBFs by finding the optimal false positive rate and number of elements stored in each bin. The experimental results demonstrate the application of CS and the bat algorithm for various numbers of bins and strings.
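To make the bitmap-size trade-off concrete, the following is a minimal Python sketch of a plain Bloom filter, not the bin Bloom filter or the optimisation procedure from the paper. The class name `BloomFilter`, the function `expected_false_positive_rate`, and the use of SHA-256 with double hashing are illustrative assumptions; the abstract does not specify any of them. The rate function uses the standard approximation (1 - e^(-kn/m))^k for m bits, k hash functions and n stored items.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter over strings (hypothetical helper, not from the paper)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # bitmap, all bits cleared

    def _positions(self, item: str):
        # Assumption: derive k positions by double hashing one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position selected for this item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def expected_false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation of the false positive rate: (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


if __name__ == "__main__":
    spam_words = ["free", "winner", "lottery"]  # made-up sample words
    bf = BloomFilter(num_bits=1024, num_hashes=3)
    for word in spam_words:
        bf.add(word)
    print(all(word in bf for word in spam_words))
    # Growing the bitmap lowers the expected false positive rate.
    for m in (1024, 8192):
        print(m, expected_false_positive_rate(m, 3, 100))
```

A stored word is always reported as present, so errors occur only as false positives. In a binned design such as the one the abstract describes, each bin could in principle be given its own bitmap size and hence its own false positive rate, although how the paper assigns them is not stated here.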
