
One-Bit DNA Compression Algorithm




The rapid growth of genomic sequences (DNA or RNA) stored in databases poses a serious challenge to the storage, processing and transmission of these data. Effective management of genetic data is therefore necessary, which makes data compression unavoidable. Current standard compression tools are insufficient for compressing DNA sequences. In this paper we propose an efficient lossless DNA compression algorithm based on a One-Bit Compression method (OBComp) that compresses both repeated and non-repeated sequences. Unlike the direct coding technique, where two bits are assigned to each nucleotide, resulting in a compression ratio of 2 bits per base (bpb), OBComp uses just a single bit, 0 or 1, to code the two most frequent nucleotides. The positions of the other two are saved. To further enhance the compression, a modified version of the Run Length Encoding technique and the Huffman coding algorithm are then applied, in that order. The proposed algorithm efficiently reduces the original size of DNA sequences. Its ease of implementation and its remarkable compression ratio make it attractive to use.
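The one-bit stage described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the positions of the two rarer nucleotides are stored or how ties in frequency are broken, so the layout below is an assumption made for illustration: the rarer nucleotides are removed from the bit stream and their indices kept in plain lists, and ties are broken alphabetically. The function names `one_bit_encode` and `one_bit_decode` are hypothetical, the input is assumed to contain only A, C, G and T, and the subsequent Run Length Encoding and Huffman stages are omitted.

```python
from collections import Counter

ALPHABET = "ACGT"  # assumption: the input contains only these four symbols


def one_bit_encode(seq):
    """Code the two most frequent nucleotides as '0'/'1'; save positions of the rest."""
    counts = Counter(seq)
    # Rank by descending frequency; ties broken alphabetically (an assumption).
    ranked = sorted(ALPHABET, key=lambda n: (-counts[n], n))
    zero, one = ranked[0], ranked[1]
    positions = {ranked[2]: [], ranked[3]: []}
    bits = []
    for i, nucleotide in enumerate(seq):
        if nucleotide == zero:
            bits.append("0")
        elif nucleotide == one:
            bits.append("1")
        else:
            # Rarer nucleotides contribute no bit; only their index is recorded.
            positions[nucleotide].append(i)
    return "".join(bits), (zero, one), positions


def one_bit_decode(bits, frequent, positions, length):
    """Rebuild the original sequence from the bit string and saved positions."""
    out = [None] * length
    for nucleotide, indices in positions.items():
        for i in indices:
            out[i] = nucleotide
    remaining = iter(bits)
    for i in range(length):
        if out[i] is None:
            out[i] = frequent[int(next(remaining))]
    return "".join(out)


seq = "AACGTAAC"
bits, frequent, positions = one_bit_encode(seq)
restored = one_bit_decode(bits, frequent, positions, len(seq))
```

In this sketch the bit string is kept as text for readability; a real implementation would pack the bits into bytes and encode the position lists compactly before any further compression stage.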


