
CONTENT-BASED SEGMENTATION SCHEME FOR DATA COMPRESSION IN STORAGE AND TRANSMISSION INCLUDING HIERARCHICAL SEGMENT REPRESENTATION


Abstract

In a coding system, input data is encoded. The input data might include sequences of symbols that repeat within the input data or that occur in other input data encoded in the system. The encoding includes determining a target segment size, determining a window size, identifying a fingerprint within a window of symbols at an offset in the input data, determining whether the offset is to be designated as a cut point, and segmenting the input data as indicated by the set of cut points. For each segment so identified, the encoder determines whether the segment is to be a referenced segment or an unreferenced segment, replaces the segment data of each referenced segment with a reference label, and stores a reference binding in a persistent segment store for each referenced segment, if needed. Hierarchically, the process can be repeated by grouping references into groups, replacing the grouped references with a group label, storing a binding between the grouped references and the group label if one is not already present, and repeating the process. The number of levels of hierarchy can be fixed in advance or it can be determined from the content encoded.
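
The steps in the abstract can be illustrated with a small Python sketch. It is not the patented method. The abstract does not specify the fingerprint function, the cut-point test, the rule for choosing referenced versus unreferenced segments, or how references are grouped, so everything below is an assumption made for illustration: a polynomial rolling hash over the window, a cut wherever the fingerprint is divisible by the target size, a length threshold (`min_ref`) for referencing, truncated SHA-256 labels, a plain `dict` standing in for the persistent segment store, and fixed-size grouping (`fanout`) for one level of hierarchy. All function and parameter names are hypothetical.

```python
import hashlib

WINDOW = 16             # assumed window size, in bytes
TARGET = 64             # assumed target (average) segment size, in bytes
BASE, MOD = 257, (1 << 31) - 1


def cut_points(data, window=WINDOW, target=TARGET):
    """Offsets at which a segment ends, chosen from the content alone."""
    cuts = []
    if len(data) < window:
        return cuts
    top = pow(BASE, window - 1, MOD)   # weight of the byte leaving the window
    fp = 0
    for i, b in enumerate(data):
        if i >= window:
            fp = (fp - data[i - window] * top) % MOD   # drop the oldest byte
        fp = (fp * BASE + b) % MOD                     # add the newest byte
        # Assumed cut test: fingerprint of the last `window` bytes hits 0 mod target.
        if i + 1 >= window and fp % target == 0:
            cuts.append(i + 1)
    return cuts


def segment(data):
    """Split data at its cut points."""
    segs, start = [], 0
    for c in cut_points(data):
        segs.append(data[start:c])
        start = c
    if start < len(data):
        segs.append(data[start:])
    return segs


def _label(prefix, payload):
    return prefix + hashlib.sha256(payload).hexdigest()[:16]


def encode(data, store, min_ref=8):
    """Replace each long-enough segment with a label; bind label -> bytes in store."""
    out = []
    for seg in segment(data):
        if len(seg) < min_ref:            # assumed rule: short segments stay unreferenced
            out.append(("raw", seg))
        else:
            label = _label("S", seg)
            store.setdefault(label, seg)  # store the binding only if not already present
            out.append(("ref", label))
    return out


def group_refs(encoded, store, fanout=4):
    """One hierarchy level: replace each run of `fanout` references with a group label."""
    out, run = [], []
    for kind, val in encoded:
        if kind == "ref":
            run.append(val)
            if len(run) == fanout:
                glabel = _label("G", "".join(run).encode())
                store.setdefault(glabel, tuple(run))
                out.append(("ref", glabel))
                run = []
        else:
            out.extend(("ref", l) for l in run)   # incomplete run stays ungrouped
            run = []
            out.append((kind, val))
    out.extend(("ref", l) for l in run)
    return out


def resolve(label, store):
    """Expand a label back to bytes, following group bindings recursively."""
    bound = store[label]
    if isinstance(bound, bytes):
        return bound
    return b"".join(resolve(l, store) for l in bound)


def decode(encoded, store):
    return b"".join(resolve(v, store) if k == "ref" else v for k, v in encoded)
```

A possible usage, under the same assumptions: call `encode(data, store)` with a shared `store`, optionally pass the result through `group_refs` one or more times, and recover the input with `decode`. Because each fingerprint depends only on the last `WINDOW` bytes, inserting bytes near the start of the input shifts the cut points but leaves the later segments, and therefore their labels, unchanged.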
