
Efficient Compression of non-repetitive DNA sequences using Dynamic Programming

Abstract

DNA compression has been a subject of great interest since genomic databases became available. Although two bits are sufficient to encode the four bases of DNA (namely A, G, T and C), the massive size of DNA sequences compels the need for efficient compression. General text compression methods do not make use of characteristics specific to DNA sequences. DNA-specific compression algorithms usually take advantage of repeated sequences. DNA sequences with high repetition rates are best compressed by dictionary-based compression algorithms. However, segments of DNA that do not reappear in the sequence are compressed using a different text compression scheme. In this paper, we propose an encoding scheme, based on a dynamic programming approach, to compress the non-repeat regions of DNA sequences. To test the efficiency of the method, we incorporate the encoding scheme into a DNA-specific algorithm, DNAPack. The performance of this algorithm is compared with that of various DNA compression algorithms. The results show that our method achieves better results in many cases.
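
The two-bit baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It shows only the naive fixed-width packing that a DNA compressor has to beat, not the dynamic-programming scheme proposed in the paper. The bit assignment (A=00, C=01, G=10, T=11) and the function names `pack` and `unpack` are assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical 2-bit code assignment; the abstract does not prescribe one.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index i is the base whose code is i


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a final chunk that has fewer than four bases.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    # Drop the padding added to the last byte.
    return "".join(bases[:length])


if __name__ == "__main__":
    seq = "GATTACA"
    packed = pack(seq)
    print(len(seq), "bases ->", len(packed), "bytes")
    print(unpack(packed, len(seq)) == seq)
```

Because the last byte may be padded, the sequence length has to be stored alongside the packed data. Any scheme for non-repeat regions is effectively judged against this cost of two bits per base.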
