LARGE DATA SET NEGATIVE INFORMATION STORAGE MODEL

Abstract

Systems and methods for storing large data sets, such as genetic sequence information, are described. Within a "targeted subset" of positions that carry information, the system stores both variant states and missing states at each position. Reference states are not stored; within the targeted subset, a position with neither a stored variant nor a stored missing state is inferred to be in the reference state. The criteria for missing data are defined in pre-processing and can be customized to the use case. For example, each data point may represent the genetic information of a sample at a position in the genome, and the targeted subset may represent the positions that were included in a sequencing test.
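The lookup rule in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify any data structures, so everything here is an assumption made for illustration: the names `State`, `SampleStore`, `add_variant`, `add_missing`, and `state_at` are hypothetical, positions are modeled as plain integers, and an extra `UNTARGETED` state is introduced to label positions outside the targeted subset, about which nothing can be inferred.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class State(Enum):
    REFERENCE = "reference"    # inferred, never stored
    VARIANT = "variant"        # stored explicitly
    MISSING = "missing"        # stored explicitly
    UNTARGETED = "untargeted"  # outside the targeted subset (illustrative label)


@dataclass
class SampleStore:
    """Hypothetical per-sample store: only variant and missing positions are kept."""
    targeted: Set[int]                                      # positions covered by the test
    variants: Dict[int, str] = field(default_factory=dict)  # position -> variant allele
    missing: Set[int] = field(default_factory=set)          # positions flagged as missing

    def _check_targeted(self, pos: int) -> None:
        if pos not in self.targeted:
            raise ValueError(f"position {pos} is outside the targeted subset")

    def add_variant(self, pos: int, allele: str) -> None:
        self._check_targeted(pos)
        self.variants[pos] = allele

    def add_missing(self, pos: int) -> None:
        # What counts as "missing" is decided upstream, in pre-processing.
        self._check_targeted(pos)
        self.missing.add(pos)

    def state_at(self, pos: int) -> State:
        if pos not in self.targeted:
            return State.UNTARGETED
        if pos in self.missing:
            return State.MISSING
        if pos in self.variants:
            return State.VARIANT
        # Neither a variant nor a missing state is stored: infer reference.
        return State.REFERENCE


store = SampleStore(targeted={100, 101, 102, 103})
store.add_variant(101, "T")
store.add_missing(102)
for pos in (100, 101, 102, 999):
    print(pos, store.state_at(pos).value)
```

In this sketch the storage cost grows with the number of variant and missing positions rather than with the size of the targeted subset, which is the point of not storing reference states. Checking `missing` before `variants` is an arbitrary choice for the case where a position is recorded as both; the abstract does not address that case.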