Searching a pattern in compressed DNA sequences.

Abstract

This paper introduces a novel algorithm for DNA sequence compression that applies a transformation to the sequence and exploits the statistical properties of the transformed sequence. A word-based tagged code marks the end of each codeword, and the word-based encoder assigns codes to words according to their frequency distribution. The resulting algorithm is efficient and effective for DNA sequence compression. Because it is a statistical compression method, it can search for a pattern inside the compressed text, which is useful in knowledge discovery. Experiments show that our algorithm outperforms existing compressors on typical DNA sequence datasets.
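
The abstract does not give the exact code or transformation, so the following is only a minimal sketch of the general idea of a frequency-ranked, word-based tagged code that can be searched without decompression. It swaps in a well-known scheme, the end-tagged dense code, in which the high bit of a byte marks the last byte of a codeword. The word length `K = 4`, the fixed-length word split, and all function names are assumptions made for illustration; the paper's transformation step is not modeled.

```python
from collections import Counter

K = 4  # assumed word length; the abstract does not state one


def split_words(seq, k=K):
    """Cut the sequence into consecutive k-letter words (the last may be shorter)."""
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def encode_rank(rank):
    """End-tagged dense code: 7 payload bits per byte, high bit set on the last byte."""
    out = [0x80 | (rank % 128)]
    rank = rank // 128 - 1
    while rank >= 0:
        out.append(rank % 128)
        rank = rank // 128 - 1
    return bytes(reversed(out))


def build_codebook(words):
    """More frequent words get lower ranks and therefore shorter codes."""
    ranked = [w for w, _ in Counter(words).most_common()]
    return {w: encode_rank(r) for r, w in enumerate(ranked)}


def compress(seq, k=K):
    words = split_words(seq, k)
    book = build_codebook(words)
    return b"".join(book[w] for w in words), book


def search(compressed, book, pattern, k=K):
    """Return base offsets of word-aligned occurrences, found in the compressed bytes."""
    words = split_words(pattern, k)
    if any(w not in book for w in words):
        return []  # a word that was never encoded cannot occur
    needle = b"".join(book[w] for w in words)
    hits = []
    pos = compressed.find(needle)
    while pos != -1:
        # A true match starts right after a tagged byte (or at the very start).
        if pos == 0 or compressed[pos - 1] & 0x80:
            word_index = sum(b >> 7 for b in compressed[:pos])  # count tagged bytes
            hits.append(word_index * k)
        pos = compressed.find(needle, pos + 1)
    return hits


data, book = compress("ACGTACGTTTGAACGT")
print(search(data, book, "ACGT"))
```

In this simplified form the search only finds occurrences that start on a word boundary and assumes the pattern length is a multiple of `K`; handling arbitrary alignments would need additional work that the abstract does not describe.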
