Conference: IEEE International Conference on Bioinformatics and Biomedicine

A new algorithm for "the LCS problem" with application in compressing genome resequencing data


Abstract

The longest common subsequence (LCS) problem is a classical problem in computer science and forms the basis of the current best-performing reference-based compression schemes for genome resequencing data. First, we present a new algorithm for the LCS problem. Then, we introduce an LCS-motivated reference-based compression scheme that uses the components of the LCS rather than the LCS itself. For the Homo sapiens genome (original size 3,080,436,051 bytes), our proposed scheme compresses the genome to 5,267,656 bytes. This compares with the previous best results of 19,666,791 bytes (Wang and Zhang, 2011) and 17,971,030 bytes (Pinho, Pratas, and Garcia, 2011). Thus, our compressed output is about 3.73 and 3.41 times smaller, respectively, than that of these state-of-the-art reference-based compression algorithms.
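
For readers unfamiliar with the LCS problem, the following is a minimal Python sketch of the textbook dynamic-programming solution, followed by a check of the size ratios quoted in the abstract. It is not the new algorithm the paper presents, which the abstract does not describe; the function names `lcs` and `times_smaller` are hypothetical and chosen only for illustration. The byte counts are copied from the abstract.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b.

    Textbook O(len(a) * len(b)) dynamic program, shown for illustration
    only; this is NOT the paper's new algorithm.
    """
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from dp[m][n] to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def times_smaller(previous_bytes: int, proposed_bytes: int) -> float:
    """How many times smaller the proposed output is than a previous one."""
    return previous_bytes / proposed_bytes


# Compressed sizes in bytes, as reported in the abstract.
PROPOSED = 5_267_656
WANG_ZHANG_2011 = 19_666_791
PINHO_PRATAS_GARCIA_2011 = 17_971_030

if __name__ == "__main__":
    print(lcs("AGGTAB", "GXTXAYB"))  # GTAB
    print(round(times_smaller(WANG_ZHANG_2011, PROPOSED), 2))           # 3.73
    print(round(times_smaller(PINHO_PRATAS_GARCIA_2011, PROPOSED), 2))  # 3.41
```

The quadratic table makes this sketch impractical for whole genomes; it is meant only to make the problem statement and the reported ratios concrete.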
