Journal of Bioinformatics and Computational Biology

WBFQC: A new approach for compressing next-generation sequencing data splitting into homogeneous streams



Abstract

Genomic data now play a vital role in a number of fields, such as personalized medicine, forensics, drug discovery, sequence alignment, and agriculture. With advances in next-generation sequencing (NGS) technology and reductions in its cost, these data are growing exponentially. NGS data are being generated more rapidly than they can be meaningfully analyzed. Thus, there is considerable scope for developing novel data compression algorithms that directly facilitate data analysis as well as data transfer and storage. An innovative compression technique is proposed here to address the problem of transmitting and storing large NGS data. This paper presents a lossless, non-reference-based FastQ file compression approach that segregates the data into three different streams and then applies an appropriate and efficient compression algorithm to each. Experiments show that the proposed approach (WBFQC) outperforms other state-of-the-art approaches for compressing NGS data in terms of compression ratio (CR) and compression and decompression time. It also offers random access to the compressed genomic data. An open-source FastQ compression tool is also provided (http://www.algorithm-skg.com/wbfqc/home.html).
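
The stream-splitting idea described in the abstract can be illustrated with a minimal sketch. This is not the WBFQC implementation. The abstract does not say what the three streams are or which algorithm is applied to each, so the sketch assumes the streams are the identifier, sequence, and quality lines of each record, and it uses Python's standard `bz2`, `lzma`, and `zlib` codecs as arbitrary stand-ins. It also assumes four-line records with a bare `+` separator line. The function names and the compression ratio definition (original size divided by compressed size) are hypothetical choices made for illustration.

```python
import bz2
import lzma
import zlib


def split_fastq_streams(fastq_text):
    """Split FASTQ text into identifier, sequence, and quality streams.

    Assumption: every record is exactly four lines and line 3 is a bare "+".
    """
    lines = fastq_text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    ids, seqs, quals = [], [], []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or plus != "+":
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        ids.append(header[1:])
        seqs.append(seq)
        quals.append(qual)
    return ids, seqs, quals


def compress_streams(ids, seqs, quals):
    """Compress each stream separately (the codecs here are stand-ins only)."""
    return {
        "ids": bz2.compress("\n".join(ids).encode("ascii")),
        "seqs": lzma.compress("\n".join(seqs).encode("ascii")),
        "quals": zlib.compress("\n".join(quals).encode("ascii"), 9),
    }


def decompress_streams(packed):
    """Rebuild the FASTQ text from the three compressed streams."""
    ids = bz2.decompress(packed["ids"]).decode("ascii").split("\n")
    seqs = lzma.decompress(packed["seqs"]).decode("ascii").split("\n")
    quals = zlib.decompress(packed["quals"]).decode("ascii").split("\n")
    records = [f"@{i}\n{s}\n+\n{q}\n" for i, s, q in zip(ids, seqs, quals)]
    return "".join(records)


def compression_ratio(fastq_text, packed):
    """Original size over total compressed size (an assumed CR definition)."""
    original = len(fastq_text.encode("ascii"))
    compressed = sum(len(blob) for blob in packed.values())
    return original / compressed


if __name__ == "__main__":
    sample = "@r1\nACGT\n+\nIIII\n@r2\nGGCA\n+\n!!#I\n"
    packed = compress_streams(*split_fastq_streams(sample))
    assert decompress_streams(packed) == sample  # lossless round trip
    print(f"CR = {compression_ratio(sample, packed):.2f}")
```

Grouping like with like is the point of the split: identifiers are highly repetitive text, sequences draw on a small alphabet, and quality strings have their own statistics, so each stream can in principle be matched to a codec that suits it. On a tiny input such as the sample above, codec overhead dominates and the ratio falls below 1; any benefit appears only on realistically sized files.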
