Source: Computing Reviews

Faster compressed suffix trees for repetitive collections


Abstract

The suffix tree is a celebrated data structure in stringology, used to solve a plethora of problems efficiently. Its main drawback is space usage: a suffix tree may require as much as 20 bytes per text symbol. One remedy is the compressed suffix tree (CST). This paper proposes a new CST called grammar-compressed topology (GCT), which achieves low space on repetitive collections together with much better running times. In fact, GCT can be seen as a specialist for highly repetitive collections: the paper's experiments show that on synthetic DNA collections with 99.9 percent similarity, GCT uses slightly more space but runs up to three orders of magnitude faster.
