
Data transformation of Cassandra files for improved deduplication during backup


Abstract

Cassandra SSTable data is transformed into data rows of a consistent size, such that the data in each row fits within a selected fixed-size segment (measured in kilobytes) used for deduplication. The tables of a Cassandra cluster node are translated in parallel to JSON format using Cassandra SSTableDump, and the table rows are parsed to produce data rows corresponding to the data in each table row. Each data row is padded with a predictable pattern of bits so that its length corresponds to the selected fixed segment size and its boundaries fall on multiples of that segment size. Because each data row starts on a segment boundary, duplicate rows of data are identified wherever they move within a table.
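
The padding step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the rows have already been exported to JSON (for example by SSTableDump) and parsed into Python dictionaries. The names `SEGMENT_SIZE`, `PAD_BYTE`, `pad_row`, `transform_table`, and `segments` are hypothetical, as are the 4 KB segment size and the zero-byte padding pattern. Rows longer than one segment are padded up to the next multiple of the segment size, which is also an assumption.

```python
import json

SEGMENT_SIZE = 4096   # assumed fixed segment size in bytes (4 KB)
PAD_BYTE = b"\x00"    # assumed predictable padding pattern


def pad_row(row: dict, segment_size: int = SEGMENT_SIZE) -> bytes:
    """Serialize one row and pad it to a multiple of segment_size."""
    # Canonical serialization so identical rows yield identical bytes.
    data = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
    remainder = len(data) % segment_size
    if remainder:
        data += PAD_BYTE * (segment_size - remainder)
    return data


def transform_table(rows, segment_size: int = SEGMENT_SIZE) -> bytes:
    """Concatenate padded rows so every row starts on a segment boundary."""
    return b"".join(pad_row(row, segment_size) for row in rows)


def segments(blob: bytes, segment_size: int = SEGMENT_SIZE):
    """Split a transformed table into fixed-size segments."""
    return [blob[i:i + segment_size] for i in range(0, len(blob), segment_size)]


if __name__ == "__main__":
    rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
    before = transform_table(rows)
    after = transform_table([rows[2], rows[0], rows[1]])  # same rows, moved
    # A fixed-segment deduplicator sees the same set of segments either way.
    print(set(segments(before)) == set(segments(after)))
```

Because each padded row occupies whole segments, reordering rows within a table only reorders segments; it does not create new ones, which is what allows a fixed-segment deduplicator to recognize the moved rows.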
