Published in: IEEE Conference on Local Computer Networks

BAH: A Bitmap Index Compression Algorithm for Fast Data Retrieval

Abstract

Efficient retrieval of archived traffic data is essential for detecting network attacks such as advanced persistent threat (APT) attacks. To gain insight from Internet traffic, bitmap indexes are increasingly used to query large datasets efficiently. However, a raw bitmap index consumes a large amount of space and adds overhead when indexes are loaded. Various bitmap index compression algorithms have been proposed to save storage while improving query efficiency. This paper proposes a new bitmap index compression algorithm called BAH (Byte Aligned Hybrid compression coding). An acceleration algorithm using SIMD is designed to increase the efficiency of the AND operation over multiple compressed bitmaps. Overall, BAH achieves a better compression ratio and faster intersection queries than several previous schemes, such as WAH, PLWAH, COMPAX, and Roaring. Theoretical analysis shows that, for bitmaps with density greater than 0.2%, the space required by BAH is no larger than 1.6 times the information entropy of the bitmap. In the experiments, BAH saves about 65% and 60% of the space compared with WAH on two datasets, respectively. The experiments also demonstrate the query efficiency of BAH in applications to Internet traffic and Web pages.
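
To make the abstract's terms concrete, the following is a minimal Python sketch of an uncompressed bitmap index, an AND (intersection) query over it, and the entropy figure against which a bound such as "1.6 times the information entropy" could be measured. It does not implement BAH's encoding or its SIMD acceleration, which the abstract does not describe. The function names and the sample flow records are hypothetical, and the entropy model (independent bits, each set with probability equal to the density) is an assumption rather than the paper's stated definition.

```python
import math


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds that value."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


def entropy_bits(n_bits, density):
    """Shannon entropy in bits of n_bits independent bits, each set with probability density.

    Assumption: this independent-bit model stands in for the paper's
    "information entropy of the bitmap".
    """
    if density <= 0.0 or density >= 1.0:
        return 0.0
    h = -density * math.log2(density) - (1 - density) * math.log2(1 - density)
    return n_bits * h


# Hypothetical flow records, one list per indexed attribute.
proto = ["tcp", "udp", "tcp", "tcp", "udp", "tcp"]
port = [80, 53, 443, 80, 53, 80]

proto_index = build_bitmap_index(proto)
port_index = build_bitmap_index(port)

# Intersection query: proto == "tcp" AND port == 80 is one bitwise AND.
hits = proto_index["tcp"] & port_index[80]
print(rows_of(hits))  # [0, 3, 5]

# Entropy of a one-million-bit bitmap at the 0.2% density named in the abstract,
# and the size that 1.6 times that entropy would correspond to.
n_bits, density = 1_000_000, 0.002
entropy = entropy_bits(n_bits, density)
print(f"entropy: {entropy:.0f} bits, 1.6x entropy: {1.6 * entropy:.0f} bits, raw: {n_bits} bits")
```

Under this model, a sparse bitmap carries far fewer bits of information than its raw length, which is the gap that compression schemes such as WAH, PLWAH, COMPAX, Roaring, and BAH aim to close while still supporting AND directly on the compressed form.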