
Compression of genomic data file


Abstract

Systems and methods for compressing a genomic data file are described herein. In one embodiment, genomic sequences, sequence headers, and quality sequences associated with a plurality of data streams in a genomic data file are identified. Each genomic sequence includes at least one of primary characters and secondary characters. The secondary characters may be removed from each genomic sequence to obtain an intermediate genomic sequence file, and the quality score corresponding to each secondary character may be modified in the quality sequences to obtain an intermediate quality sequence file. A modified genomic sequence file and a modified quality sequence file are then generated from the intermediate genomic sequence file and the intermediate quality sequence file, respectively. A compressed genomic data file is obtained using at least the modified genomic sequence and the modified quality sequence.
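The removal step in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the character classes or how the quality score is modified, so everything specific here is an assumption: primary characters are taken to be A, C, G, and T, every other character is treated as secondary, and the quality score at each secondary position is overwritten with a hypothetical sentinel character (`~`) that marks where a base was removed. The function names `remove_secondary` and `restore_sequence` are illustrative only and do not come from the patent.

```python
# Assumption: primary characters are the four unambiguous bases.
PRIMARY = frozenset("ACGT")
# Assumption: a sentinel quality character marks positions whose base was removed.
SENTINEL = "~"


def remove_secondary(sequence, quality):
    """Split one read into an intermediate sequence and an intermediate quality string.

    Returns (intermediate_sequence, intermediate_quality, removed_characters).
    The original quality score at each secondary position is discarded in this sketch.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    if SENTINEL in quality:
        raise ValueError("sentinel character already present in quality string")

    kept_bases = []
    new_quality = []
    removed = []
    for base, score in zip(sequence, quality):
        if base in PRIMARY:
            kept_bases.append(base)
            new_quality.append(score)
        else:
            removed.append(base)
            new_quality.append(SENTINEL)  # mark the position of the removed base
    return "".join(kept_bases), "".join(new_quality), "".join(removed)


def restore_sequence(intermediate_sequence, intermediate_quality, removed):
    """Rebuild the original sequence using the sentinel positions in the quality string."""
    bases = iter(intermediate_sequence)
    secondary = iter(removed)
    return "".join(
        next(secondary) if score == SENTINEL else next(bases)
        for score in intermediate_quality
    )


seq, qual, removed = remove_secondary("ACNNGT", "II!!II")
print(seq, qual, removed)
print(restore_sequence(seq, qual, removed))
```

Under these assumptions the intermediate sequence contains only four distinct symbols, which could then be packed at two bits per base, while the quality string still records where the removed characters sat. Whether the patented method works this way is not stated in the abstract.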
