Efficient column based data encoding for large-scale data storage

Abstract

The subject disclosure relates to column based data encoding where raw data to be compressed is organized by columns, and then, as first and second layers of reduction of the data size, dictionary encoding and/or value encoding are applied to the data as organized by columns, to create integer sequences that correspond to the columns. Next, a hybrid greedy run length encoding and bit packing compression algorithm further compacts the data according to an analysis of bit savings. Synergy of the hybrid data reduction techniques in concert with the column-based organization, coupled with gains in scanning and querying efficiency owing to the representation of the compact data, results in substantially improved data compression at a fraction of the cost of conventional systems.
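
The pipeline the abstract describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented method: the function names, the fixed `run_header_bits` cost charged for storing a run length, and the per-run greedy rule are all hypothetical choices made for this example. Value encoding is omitted. A single column is dictionary-encoded into integer IDs, and each run of equal IDs is then stored either as a (value, count) pair or as bit-packed values, whichever costs fewer bits.

```python
from itertools import groupby


def dictionary_encode(column):
    """Map each distinct value to a small integer ID, in first-seen order."""
    ids = {}
    encoded = []
    for value in column:
        if value not in ids:
            ids[value] = len(ids)
        encoded.append(ids[value])
    # Position i of the dictionary holds the original value for ID i.
    return list(ids), encoded


def bits_needed(max_value):
    """Bits required to represent any integer in [0, max_value]."""
    return max(1, max_value.bit_length())


def hybrid_encode(ids, run_header_bits=32):
    """Greedy choice per run: use RLE only if it saves bits over bit packing.

    run_header_bits is an assumed fixed cost for storing a run's length.
    """
    width = bits_needed(max(ids)) if ids else 1
    segments = []  # ("rle", value, count) or ("packed", [values])
    for value, group in groupby(ids):
        count = len(list(group))
        rle_cost = width + run_header_bits   # one value plus the run length
        packed_cost = width * count          # every element stored at `width` bits
        if rle_cost < packed_cost:
            segments.append(("rle", value, count))
        elif segments and segments[-1][0] == "packed":
            # Merge short runs into the preceding bit-packed segment.
            segments[-1][1].extend([value] * count)
        else:
            segments.append(("packed", [value] * count))
    return width, segments


def hybrid_decode(segments):
    """Expand segments back into the original integer sequence."""
    out = []
    for segment in segments:
        if segment[0] == "rle":
            out.extend([segment[1]] * segment[2])
        else:
            out.extend(segment[1])
    return out


# Example column: long runs at both ends, short mixed values in the middle.
column = ["NY"] * 40 + ["CA", "TX", "CA", "NY"] + ["TX"] * 50
dictionary, ids = dictionary_encode(column)
width, segments = hybrid_encode(ids)
restored = [dictionary[i] for i in hybrid_decode(segments)]
```

With these assumed costs, the two long runs become RLE segments and the four mixed values in the middle are bit-packed together at 2 bits each. A real implementation would also need to serialize the segments and tune the cost model to its storage layout.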
