System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences

Abstract

A system and method for unorchestrated determination of data sequences using “sticky byte” factoring to determine breakpoints in digital sequences such that common sequences can be identified. Sticky byte factoring provides an efficient method of dividing a data set into pieces that generally yields near optimal commonality. This is effectuated by employing a rolling hashsum and, in an exemplary embodiment disclosed herein, a threshold function to deterministically set divisions in a sequence of data. Both the rolling hash and the threshold function are designed to require minimal computation. This low overhead makes it possible to rapidly partition a data sequence for presentation to a factoring engine or other applications that prefer subsequent synchronization across the data set.
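The abstract's combination of a rolling hash and a threshold test can be illustrated with a minimal Python sketch. This is not the patent's disclosed implementation: the polynomial hash, the window length, the base, the bit mask, and the names `find_breakpoints` and `split` are all assumptions chosen for illustration. The sketch only shows the general principle that a breakpoint is declared wherever a hash of the most recent bytes passes a cheap test, so the divisions depend on local content rather than on absolute position.

```python
from typing import List

# All constants below are illustrative assumptions, not values from the patent.
WINDOW = 16          # number of trailing bytes covered by the rolling hash
BASE = 257           # polynomial base for the hash
MOD = 1 << 32        # keep the hash within 32 bits
MASK = (1 << 6) - 1  # threshold test: break when the low 6 bits are zero


def find_breakpoints(data: bytes, window: int = WINDOW, mask: int = MASK) -> List[int]:
    """Return the offsets at which a chunk ends (exclusive end positions)."""
    breakpoints: List[int] = []
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for i, b in enumerate(data):
        if i >= window:
            # Remove the contribution of the byte that slides out of the window.
            h = (h - data[i - window] * top) % MOD
        # Shift the remaining bytes up one position and add the new byte.
        h = (h * BASE + b) % MOD
        # Test only once the window is full, so the hash covers exactly `window` bytes.
        if i >= window - 1 and (h & mask) == 0:
            breakpoints.append(i + 1)
    return breakpoints


def split(data: bytes) -> List[bytes]:
    """Cut `data` at the breakpoints; the pieces concatenate back to `data`."""
    chunks: List[bytes] = []
    start = 0
    for cut in find_breakpoints(data):
        chunks.append(data[start:cut])
        start = cut
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because the hash in this sketch depends only on the last `WINDOW` bytes, prepending bytes to the input shifts the existing breakpoints without otherwise changing them. That resynchronization is what allows common sequences to be found again after an insertion.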

Bibliographic details

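  • Publication number: US7272602B2
  • Publication date: 2007-09-18
  • Original format: PDF
  • Applicant/assignee: GREGORY HAGAN MOULTON
  • Application number: US20040861796
  • Inventor: GREGORY HAGAN MOULTON
  • Filing date: 2004-06-04
  • Classification: G06F17/30; G06F7/00
  • Country: US
  • Record added to database: 2022-08-21 21:02:41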
