Indexing Highly Repetitive String Collections, Part Ⅱ: Compressed Indexes

Navarro Gonzalo


Abstract

Two decades ago, a breakthrough in indexing string collections made it possible to represent them within their compressed space while at the same time offering indexed search functionalities. As this new technology permeated through applications like bioinformatics, the string collections experienced a growth that outperforms Moore's Law and challenges our ability to handle them even in compressed form. It turns out, fortunately, that many of these rapidly growing string collections are highly repetitive, so that their information content is orders of magnitude lower than their plain size. The statistical compression methods used for classical collections, however, are blind to this repetitiveness, and therefore a new set of techniques has been developed to properly exploit it. The resulting indexes form a new generation of data structures able to handle the huge repetitive string collections that we are facing. In this survey, formed by two parts, we cover the algorithmic developments that have led to these data structures. In this second part, we describe the fundamental algorithmic ideas and data structures that form the base of all the existing indexes, and the various concrete structures that have been proposed, comparing them both in theoretical and practical aspects, and uncovering some new combinations. We conclude with the current challenges in this fascinating field.

Bibliographic record

Source
ACM Computing Surveys | 2022, Issue 2 | 26.1-26.32 | 32 pages
Author
Navarro Gonzalo
Author affiliation

Univ Chile, Ctr Biotechnol & Bioengn CeBiB, Beauchef 851, Santiago, Chile | Univ Chile, Millennium Inst Fdn Res Data IMFD, Dept Comp Sci, Beauchef 851, Santiago, Chile

Indexed in: Science Citation Index (SCI), United States; Engineering Index (EI), United States
Original format: PDF
Language: English
Keywords
Text indexing; string searching; compressed data structures; repetitive string collections
