IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

Cowic: A Column-Wise Independent Compression for Log Stream Analysis

Abstract

Massive log streams are now generated by many Internet and cloud services. Storing these streams consumes a large amount of disk space and incurs high cost. Traditional compression methods can reduce storage cost, but they are inefficient for log analysis, because fetching relevant log entries from compressed data often requires retrieving and decompressing large blocks of data. We propose a column-wise compression approach for well-formatted log streams, in which each log entry can be compressed or decompressed independently for analysis. Specifically, we separate a log entry into several columns and compress each column with a different model. We have implemented our approach as a library and integrated it into two applications: a log search system and a log joining system. Experimental results show that our compression scheme outperforms traditional compression methods in decompression time and achieves a competitive compression ratio. For log search, our approach achieves better query times than traditional compression algorithms in both in-core and out-of-core cases. For joining log streams, our approach achieves the same join quality using only 30% of the memory required by uncompressed streams.
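
The idea of splitting an entry into columns and compressing each column with its own model, so that any single entry can be decoded without touching its neighbours, can be illustrated with a minimal Python sketch. This is not Cowic's actual implementation or API: the class names `DictColumn` and `IntColumn`, the space-separated log format, and the two toy models (a fixed dictionary and a 4-byte integer) are all assumptions made for illustration.

```python
import struct


class DictColumn:
    """Hypothetical dictionary model: known values become 1-byte codes."""

    def __init__(self, values):
        self.values = list(values)
        assert len(self.values) < 0xFF  # 0xFF is reserved as the escape code
        self.codes = {v: i for i, v in enumerate(self.values)}

    def encode(self, field):
        if field in self.codes:
            return bytes([self.codes[field]])
        raw = field.encode("utf-8")
        # Unknown value: escape byte, 2-byte big-endian length, raw bytes.
        return b"\xff" + struct.pack(">H", len(raw)) + raw

    def decode(self, buf, pos):
        code = buf[pos]
        if code != 0xFF:
            return self.values[code], pos + 1
        (n,) = struct.unpack_from(">H", buf, pos + 1)
        start = pos + 3
        return buf[start:start + n].decode("utf-8"), start + n


class IntColumn:
    """Hypothetical numeric model: a non-negative integer below 2**32
    stored as 4 big-endian bytes (leading zeros are not preserved)."""

    def encode(self, field):
        return struct.pack(">I", int(field))

    def decode(self, buf, pos):
        (n,) = struct.unpack_from(">I", buf, pos)
        return str(n), pos + 4


def compress_entry(entry, models):
    """Compress one log entry on its own, one model per column."""
    fields = entry.split(" ")
    assert len(fields) == len(models)
    return b"".join(m.encode(f) for m, f in zip(models, fields))


def decompress_entry(blob, models):
    """Decompress one entry using only its own bytes and the shared models."""
    pos, fields = 0, []
    for m in models:
        field, pos = m.decode(blob, pos)
        fields.append(field)
    return " ".join(fields)


# Assumed column layout: timestamp, method, path, status.
models = [
    IntColumn(),
    DictColumn(["GET", "POST"]),
    DictColumn(["/index.html"]),
    DictColumn(["200", "404"]),
]
entry = "1700000000 GET /index.html 200"
blob = compress_entry(entry, models)
assert decompress_entry(blob, models) == entry
```

Because the models are shared and fixed, each compressed entry carries no dependency on earlier entries, which is the property the abstract contrasts with block-based compressors. A real system would presumably train its models on sample logs rather than hard-code them as done here.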
