Journal: Bioinformatics

BamHash: a checksum program for verifying the integrity of sequence data

Abstract

Large resequencing projects require a significant amount of storage for raw sequences as well as alignment files. Because the raw sequences are redundant once the alignment has been generated, it is possible to keep only the alignment files. We present BamHash, a checksum-based method to ensure that the read pairs in FASTQ files exactly match the read pairs stored in BAM files, regardless of the ordering of reads. BamHash can be used to verify the integrity of the stored files and to discover any discrepancies. Thus, BamHash can be used to determine whether it is safe to delete the FASTQ files storing raw sequencing reads after alignment, without losing data.
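The idea of a checksum that ignores read order can be illustrated with a minimal sketch. It is not taken from the BamHash source code: the function names, the choice of SHA-256, the 64-bit truncation, and the fields included in each record are all assumptions made for illustration. Each read pair is hashed on its own, and the per-pair hashes are combined by addition modulo 2^64, so the result does not depend on the order in which the pairs are visited.

```python
import hashlib

MASK64 = (1 << 64) - 1  # keep the running sum within 64 bits


def record_hash(name, seq1, qual1, seq2, qual2):
    """Hash one read pair to a 64-bit integer (hypothetical record layout)."""
    h = hashlib.sha256()
    for field in (name, seq1, qual1, seq2, qual2):
        data = field.encode("ascii")
        # Length-prefix each field so that field boundaries are unambiguous.
        h.update(len(data).to_bytes(4, "little"))
        h.update(data)
    return int.from_bytes(h.digest()[:8], "little")


def order_independent_checksum(read_pairs):
    """Return (checksum, count) for an iterable of read-pair tuples.

    Addition is commutative, so any ordering of the same pairs
    produces the same checksum.
    """
    total = 0
    count = 0
    for pair in read_pairs:
        total = (total + record_hash(*pair)) & MASK64
        count += 1
    return total, count


if __name__ == "__main__":
    pairs = [
        ("r1", "ACGT", "IIII", "TTGA", "IIII"),
        ("r2", "GGCC", "FFFF", "AATT", "FFFF"),
    ]
    shuffled = list(reversed(pairs))
    print(order_independent_checksum(pairs) == order_independent_checksum(shuffled))
```

Under these assumptions, one would compute the checksum once over the FASTQ read pairs and once over the read pairs extracted from the BAM file, then compare the two results along with the record counts. Matching values suggest that the two files hold the same read pairs. A real comparison would also have to undo changes made during alignment, such as the reverse-complementing of reads mapped to the reverse strand, which this sketch does not attempt.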
