Tsinghua Science and Technology

A survey of bitmap index compression algorithms for Big Data

Abstract

With the growing popularity of Internet applications and the widespread use of the mobile Internet, Internet traffic has grown rapidly over the past two decades. Internet Traffic Archival Systems (ITAS) for packets or flow records are increasingly used in network monitoring, network troubleshooting, and the analysis of user behavior and experience. Of the three key technologies in ITAS, this paper focuses on bitmap index compression algorithms and surveys them in detail. The current state-of-the-art bitmap index encoding schemes include BBC, WAH, PLWAH, EWAH, PWAH, CONCISE, COMPAX, VLC, DF-WAH, and VAL-WAH. We categorize these algorithms according to their differences in segmentation, chunking, merge compression, and Near Identical (NI) features. We also propose several new bitmap index encoding algorithms, namely SECOMPAX, ICX, MASC, and PLWAH+, and present the state diagrams of their encoding procedures. We then evaluate their CPU and GPU implementations on a real Internet trace from CAIDA. Finally, we summarize the results and discuss future directions for bitmap index compression algorithms. Beyond network security and network forensics, bitmap index compression, with its faster bitwise logical operations and reduced search space, is widely used in the analysis of genome data, geographical information systems, graph databases, image retrieval, the Internet of Things, and other areas. Bitmap index compression, which has been studied since the 1980s, is expected to thrive again in the Big Data era.