Journal: Bioinformatics

Disk-based compression of data from genome sequencing


Abstract

Motivation: High-coverage sequencing data have significant, yet hard to exploit, redundancy. Most FASTQ compressors cannot efficiently compress the DNA stream of large datasets, because the redundancy between overlapping reads cannot easily be captured in the (relatively small) main memory. The more promising solutions to this problem are disk based. The better of the two such solutions, from Cox et al. (2012), is based on the Burrows-Wheeler transform (BWT) and achieves 0.518 bits per base for a 134.0 Gbp human genome sequencing collection with almost 45-fold coverage.
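To make the figures above concrete, the following is a minimal Python sketch, not the method of Cox et al. (2012) nor of this paper. It converts a bits-per-base rate into a storage size and shows a naive in-memory BWT on a toy string. The function names `compressed_size_bytes` and `naive_bwt` are hypothetical and chosen for illustration; a real disk-based tool would never sort all rotations in memory.

```python
def compressed_size_bytes(num_bases: float, bits_per_base: float) -> float:
    """Return the size in bytes of a DNA stream stored at the given rate."""
    return num_bases * bits_per_base / 8


def naive_bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform by sorting all rotations of text + sentinel.

    Illustrative only: this is quadratic in memory and does not scale to
    sequencing collections. The sentinel is assumed to sort before every
    other symbol, which holds for '$' against A, C, G, T in ASCII order.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Build every cyclic rotation, sort them, and keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


# Figures quoted in the abstract: 134.0 Gbp at 0.518 bits per base.
reported_bytes = compressed_size_bytes(134.0e9, 0.518)

# Baseline for comparison (an assumption of this sketch): plain 2-bit
# packing of the four-letter DNA alphabet.
two_bit_bytes = compressed_size_bytes(134.0e9, 2.0)
```

Under these assumptions, 0.518 bits per base corresponds to roughly 8.68 GB for the 134.0 Gbp collection, against 33.5 GB for plain 2-bit packing. The BWT helps because it tends to group symbols that share a following context, so overlapping reads produce long runs that a downstream coder can compress well.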
