One-Bit DNA Compression Algorithm

Abstract

The ever-increasing volume of genomic sequences (DNA or RNA) stored in databases poses a serious challenge to the storage, processing, and transmission of these data. Effective management of genetic data is therefore necessary, which makes data compression unavoidable. Current standard compression tools are insufficient for compressing DNA sequences. In this paper we propose an efficient lossless DNA compression algorithm based on a One-Bit Compression method (OBComp) that compresses both repeated and non-repeated sequences. Unlike the direct coding technique, in which two bits are assigned to each nucleotide, resulting in a compression ratio of 2 bits per byte (bpb), OBComp uses just a single bit, 0 or 1, to code the two most frequently occurring nucleotides. The positions of the other two are saved. To further enhance compression, a modified version of the Run Length Encoding technique and the Huffman coding algorithm are then applied, in that order. The proposed algorithm efficiently reduces the original size of DNA sequences. Its ease of implementation and remarkable compression ratio make it attractive to use.
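
The one-bit stage described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `one_bit_encode` and `one_bit_decode`, the tie-breaking rule, and the way positions are stored (absolute indices into the original sequence) are all assumptions made for illustration, and the later Run Length Encoding and Huffman stages are omitted.

```python
from collections import Counter


def one_bit_encode(seq):
    """Hypothetical sketch of a one-bit stage (not the paper's exact layout).

    The two most frequent nucleotides are written as '0' and '1';
    every other nucleotide is dropped from the bit stream and its
    positions in the original sequence are recorded instead.
    """
    # Rank symbols by frequency; ties keep first-seen order.
    ranked = [base for base, _ in Counter(seq).most_common()]
    top = ranked[:2]
    bit_of = {base: str(i) for i, base in enumerate(top)}

    bits = []
    positions = {base: [] for base in ranked[2:]}
    for i, base in enumerate(seq):
        if base in bit_of:
            bits.append(bit_of[base])
        else:
            positions[base].append(i)  # assumption: absolute indices
    return "".join(bits), top, positions, len(seq)


def one_bit_decode(bits, top, positions, length):
    """Invert one_bit_encode, restoring the original sequence."""
    out = [None] * length
    # Put the less frequent nucleotides back at their saved positions.
    for base, indices in positions.items():
        for i in indices:
            out[i] = base
    # Fill the remaining slots from the bit stream, in order.
    bit_iter = iter(bits)
    for i in range(length):
        if out[i] is None:
            out[i] = top[int(next(bit_iter))]
    return "".join(out)


if __name__ == "__main__":
    sequence = "AACAGATAAC"
    encoded = one_bit_encode(sequence)
    print(encoded)
    print(one_bit_decode(*encoded) == sequence)
```

In this sketch the bit stream costs one bit per frequent nucleotide, so the saving over two-bit direct coding depends on how cheaply the saved positions can be stored, which is presumably where the subsequent Run Length Encoding and Huffman stages come in.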