
A Novel Compression Technique For DNA Sequence Compaction


Abstract

Modern biotechnology produces large amounts of genomic data. The explosion of DNA data poses challenges for understanding genomic structure, for disk storage, and for computation. Efficient compression techniques are therefore essential for handling genomic data storage. Data compression allows data to be stored in less memory, and the properties of DNA sequences offer an opportunity to build DNA-specific compression algorithms. This paper proposes a novel compression technique for genomic data. In the first stage, each base in the DNA sequence is converted into binary form using a 2-bit encoding system, and a modified run-length encoding is applied to the resulting binary string. In the second stage, the output is compressed again using Huffman encoding. The encoded sequence is then converted into ASCII characters. The technique is simple and effective.
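
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the abstract does not specify the "modified" run-length scheme, so plain run-length encoding stands in for it; the base-to-bit mapping (`A=00, C=01, G=10, T=11`) and the packing of the output into 7-bit ASCII characters are likewise assumed; and all function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import groupby

# Assumed mapping; the abstract only says "2-bit encoding system".
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def two_bit_encode(seq):
    """Stage 1a: map each base to its 2-bit code."""
    return "".join(BASE_TO_BITS[base] for base in seq.upper())


def run_length_encode(bits):
    """Stage 1b: plain RLE (stand-in for the paper's modified RLE)."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def huffman_table(symbols):
    """Stage 2: build a prefix-code table {symbol: bitstring}."""
    freq = Counter(symbols)
    if len(freq) <= 1:
        # Zero or one distinct symbol: give it a one-bit code.
        return {sym: "0" for sym in freq}
    # Heap entries carry a unique tie-breaker so the dicts are never compared.
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in t1.items()}
        merged.update({sym: "1" + code for sym, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def bits_to_ascii(bits):
    """Pack a bitstring into 7-bit ASCII characters (assumed packing)."""
    pad = -len(bits) % 7
    padded = bits + "0" * pad
    text = "".join(chr(int(padded[i:i + 7], 2)) for i in range(0, len(padded), 7))
    return text, pad


def ascii_to_bits(text, pad):
    """Inverse of bits_to_ascii: drop the trailing padding bits."""
    bits = "".join(format(ord(ch), "07b") for ch in text)
    return bits[:len(bits) - pad] if pad else bits


def compress(seq):
    """Run the whole pipeline; returns (ascii_text, pad, huffman_table)."""
    runs = run_length_encode(two_bit_encode(seq))
    table = huffman_table(runs)
    encoded = "".join(table[run] for run in runs)
    text, pad = bits_to_ascii(encoded)
    return text, pad, table


def decompress(text, pad, table):
    """Reverse the pipeline to recover the original sequence."""
    inverse = {code: sym for sym, code in table.items()}
    pieces, current = [], ""
    for b in ascii_to_bits(text, pad):
        current += b
        if current in inverse:
            bit, length = inverse[current]
            pieces.append(bit * length)
            current = ""
    binary = "".join(pieces)
    return "".join(BITS_TO_BASE[binary[i:i + 2]] for i in range(0, len(binary), 2))


if __name__ == "__main__":
    sequence = "AAAACCCCGGGGTTTTACGT"
    text, pad, table = compress(sequence)
    assert decompress(text, pad, table) == sequence
```

The sketch only demonstrates that the stages compose and round-trip; it makes no claim about compression ratio, and a real scheme would also need to store the Huffman table (and the padding count) alongside the output.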
