Conference: IEEE International Conference on Communications

MASC: A bitmap index encoding algorithm for fast data retrieval


Abstract

Fast retrieval of archived traffic data is essential for network security and forensic analysis. A bitmap index is a data structure that enables fast search over large data collections, but its space consumption is a persistent problem. WAH, PLWAH and COMPAX have been proposed to compress bitmap indexes and reduce storage. In this paper, a new bitmap index encoding scheme, named MASC, is proposed to further improve the compression ratio without impairing query performance. Instead of being limited to a fixed length (31 bits) as in PLWAH and COMPAX, the stride size in MASC can be as long as possible, so that runs of consecutive zero bits and nonzero bits are encoded more compactly. Because the piggyback in PLWAH carries only a single nonzero bit, MASC replaces it with a new structure called a carrier. We also generalize the traditional literal word concept of PLWAH and COMPAX. The validity of the MASC encoding scheme is demonstrated through its application in an Internet traffic archival system. In experiments with a real Internet traffic data set from CAIDA, MASC achieves a better compression ratio than PLWAH and COMPAX2 with no penalty in query performance.
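The abstract does not spell out the MASC word layout, so the following is not MASC. It is a minimal sketch of the baseline idea the abstract builds on: a bitmap index over one column, compressed with a simplified WAH-style scheme that splits each bitmap into fixed 31-bit groups and merges runs of identical groups into "fill" words. The function names (`build_bitmap_index`, `wah_like_encode`, `wah_like_decode`) and the tuple representation are assumptions made for illustration; real WAH packs these into 32-bit machine words.

```python
from collections import defaultdict

GROUP_BITS = 31  # payload bits per 32-bit word in WAH-style schemes


def build_bitmap_index(values):
    """Map each distinct value to a list of 0/1 bits, one bit per row."""
    index = defaultdict(lambda: [0] * len(values))
    for row, value in enumerate(values):
        index[value][row] = 1
    return dict(index)


def wah_like_encode(bits):
    """Simplified word-aligned run-length encoding (illustrative only).

    Returns a list of tuples:
      ("fill", bit, n_groups)  for n_groups consecutive uniform 31-bit groups
      ("literal", group_bits)  for a 31-bit group mixing zeros and ones
    The final group is zero-padded to 31 bits.
    """
    padded = bits + [0] * (-len(bits) % GROUP_BITS)
    words = []
    for start in range(0, len(padded), GROUP_BITS):
        group = padded[start:start + GROUP_BITS]
        if all(b == group[0] for b in group):
            # Uniform group: extend the previous fill word if it matches.
            if words and words[-1][0] == "fill" and words[-1][1] == group[0]:
                words[-1] = ("fill", group[0], words[-1][2] + 1)
            else:
                words.append(("fill", group[0], 1))
        else:
            words.append(("literal", tuple(group)))
    return words


def wah_like_decode(words, length):
    """Expand the encoded words back into the first `length` bits."""
    bits = []
    for word in words:
        if word[0] == "fill":
            bits.extend([word[1]] * (GROUP_BITS * word[2]))
        else:
            bits.extend(word[1])
    return bits[:length]


# Hypothetical column of destination ports: one rare value among many rows.
ports = [80] * 100 + [443] + [80] * 100
index = build_bitmap_index(ports)
encoded = wah_like_encode(index[443])
assert wah_like_decode(encoded, len(ports)) == index[443]
```

In this toy input the bitmap for port 443 has a single set bit, so it encodes as a zero fill, one literal group, and another zero fill. The literal group spends a whole word on one nonzero bit, which is the kind of overhead that the piggyback in PLWAH, and by the abstract's account the carrier and variable stride in MASC, are meant to reduce.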
