首页> 外文会议>Computing and combinatorics >Compressed Directed Acyclic Word Graph with Application in Local Alignment
【24h】

Compressed Directed Acyclic Word Graph with Application in Local Alignment

机译:压缩有向无环词图及其在局部对齐中的应用

获取原文
获取原文并翻译 | 示例

摘要

Suffix tree, suffix array, and directed acyclic word graph (DAWG) are data-structures for indexing a text. Although they enable efficient pattern matching, their data-structures require O(nlogn) bits, which make them impractical to index long text like human genome. Recently, the development of compressed data-structures allow us to simulate suffix tree and suffix array using O(n) bits. However, there is still no O(n)-bit data-structure for DAWG with full functionality. This work introduces an O(n)-bit data-structure for simulating DAWG. Besides, we also propose an application of DAWG to improve the time complexity for the local alignment problem. In this application, the previously proposed solutions using BWT (a version of compressed suffix tree) run in O(n~2m) worst case time and O(nm~(0.628)) average case time where n and m are the lengths of the database and the query, respectively. Using compressed DAWG proposed in this paper, the problem can be solved in O(nm) worst case time and the same average case time.
机译:后缀树,后缀数组和有向无环字图(DAWG)是用于索引文本的数据结构。尽管它们可以实现有效的模式匹配,但是它们的数据结构需要O(nlogn)位,这使它们不适合索引人类基因组之类的长文本。最近,压缩数据结构的发展使我们能够使用O(n)位来模拟后缀树和后缀数组。但是,仍然没有具有完整功能的DAWG的O(n)位数据结构。这项工作介绍了用于模拟DAWG的O(n)位数据结构。此外,我们还提出了DAWG的应用,以提高局部对准问题的时间复杂度。在此应用中,先前提出的使用BWT(压缩后缀树的一种版本)的解决方案在O(n〜2m)最坏情况时间和O(nm〜(0.628))平均情况下运行,其中n和m是长度的长度。数据库和查询。使用本文提出的压缩DAWG,可以在O(nm)最坏情况时间和相同的平均情况下解决该问题。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号