Conference: IEEE International Conference on Bioinformatics and Biomedicine

ALL-CQS: Adaptive locality-based lossy compression of quality scores

Abstract

Because they are random and noisy, the quality scores in sequencing data already account for about 70% of compressed storage and have reached their lossless compression limit. Lossy compression of quality scores that preserves the performance of the subsequent variant calling procedure is therefore an attractive option, and a major challenge, in big genomic data analysis. The current state-of-the-art locality-based lossy compressor, PBlock, assumes that all quality score lines exhibit a single locality, and applies one identical, static, manually chosen locality criterion to smooth every quality score line. This assumption usually does not hold in practice: it inevitably yields sub-optimal locality criteria for some quality score lines, which degrades the performance of both lossy compression and variant calling. Starting from the more reasonable assumption that different quality score lines exhibit different locality, this paper proposes ALL-CQS, an enhanced version of PBlock. ALL-CQS builds on PBlock's lossy mechanism and automatically applies adaptive locality criteria to smooth different quality score lines. Experimental results show that ALL-CQS not only achieves the best variant calling performance, very close to that of lossless compression, but also outperforms all the other state-of-the-art lossy compressors and achieves up to 145% improvement in compression ratio over the original lossless compressors.
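The abstract does not say how ALL-CQS chooses its locality criteria, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a block-style smoother in which a run of neighbouring scores is replaced by one representative value whenever the run's range stays within a tolerance, plus a hypothetical per-line rule (`pick_tolerance`) that derives the tolerance from the line's own spread. All function names, the `max_tol` parameter, and the standard-deviation heuristic are assumptions made for illustration.

```python
from statistics import pstdev


def smooth_line(scores, tol):
    """Greedy block smoothing of one quality score line.

    A block grows while (max - min) <= 2 * tol; every score in the block
    is then replaced by the block midpoint, so no score moves by more
    than tol (for integer scores and integer tol).
    """
    out = []
    i, n = 0, len(scores)
    while i < n:
        lo = hi = scores[i]
        j = i + 1
        while j < n:
            new_lo, new_hi = min(lo, scores[j]), max(hi, scores[j])
            if new_hi - new_lo > 2 * tol:
                break  # adding this score would violate the criterion
            lo, hi = new_lo, new_hi
            j += 1
        out.extend([(lo + hi) // 2] * (j - i))
        i = j
    return out


def pick_tolerance(scores, max_tol=4):
    """Hypothetical adaptive rule (an assumption, not from the paper):
    lines with a larger spread get a larger tolerance, capped at max_tol."""
    spread = pstdev(scores) if len(scores) > 1 else 0.0
    return min(max_tol, round(spread))


def smooth_static(lines, tol):
    """One fixed tolerance for every line (the static setting)."""
    return [smooth_line(line, tol) for line in lines]


def smooth_adaptive(lines, max_tol=4):
    """A separate tolerance per line (the adaptive setting)."""
    return [smooth_line(line, pick_tolerance(line, max_tol)) for line in lines]


if __name__ == "__main__":
    lines = [[40, 40, 40, 40, 40], [30, 31, 32, 10, 11]]
    print(smooth_static(lines, 1))
    print(smooth_adaptive(lines))
```

Longer runs of identical values are what make the smoothed lines easier for a downstream lossless coder to compress. The contrast between `smooth_static` and `smooth_adaptive` mirrors the abstract's distinction between one manual criterion for all lines and a criterion selected per line.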
