A Compressed Self-index Using a Ziv-Lempel Dictionary

Abstract

A compressed full-text self-index for a text T, of size u, is a data structure for searching patterns P, of size m, in T. It requires reduced space, i.e. space that depends on the empirical entropy (H_k, H_0) of T, and is, furthermore, able to reproduce any substring of T. In this paper we present a new compressed self-index able to locate the occurrences of P in O((m + occ) log n) time, where occ is the number of occurrences and σ the size of the alphabet of T. The fundamental improvement over previous LZ78-based indexes is the reduction of the search time dependency on m from O(m^2) to O(m). To achieve this result we point out the main obstacle to linear time algorithms based on LZ78 data compression, and we expose and explore the nature of a recurrent structure in LZ-indexes, the T_78 suffix tree. We show that our method is very competitive in practice by comparing it against the LZ-Index, the FM-index and a compressed suffix array.
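
For readers unfamiliar with the LZ78 parsing that the abstract refers to, the following is a minimal sketch of how a text is split into dictionary phrases. It is not the index described in the paper and makes no claim about the authors' implementation. The function names `lz78_parse` and `lz78_decode` and the (parent id, character) phrase encoding are assumptions chosen for illustration.

```python
def lz78_parse(text):
    """Split text into LZ78 phrases, returned as (parent_id, char) pairs.

    Phrase ids start at 1; id 0 denotes the empty phrase (the trie root).
    Illustrative sketch only, not the paper's data structure.
    """
    trie = {}      # maps (parent_id, char) -> phrase id
    phrases = []   # phrase with id i is stored at phrases[i - 1]
    node = 0       # current position in the phrase trie
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            # The current prefix extended by ch is already a phrase: descend.
            node = child
        else:
            # New phrase = longest known phrase followed by one new character.
            phrases.append((node, ch))
            trie[(node, ch)] = len(phrases)
            node = 0
    if node != 0:
        # The text ended inside a known phrase; emit it with no extra character.
        phrases.append((node, ""))
    return phrases


def lz78_decode(phrases):
    """Rebuild the original text from the (parent_id, char) pairs."""
    strings = [""]  # strings[i] is the text of phrase i; 0 is the empty phrase
    for parent, ch in phrases:
        strings.append(strings[parent] + ch)
    return "".join(strings[1:])


if __name__ == "__main__":
    parsed = lz78_parse("abababab")
    print(parsed)
    print(lz78_decode(parsed))
```

Each phrase is a previously seen phrase plus one character, so the set of phrases forms a trie. This dictionary is what LZ78-based indexes build on.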

Bibliographic record

Source
String Processing and Information Retrieval; Lecture Notes in Computer Science; 4209 | 2006 | pp. 163-180 | 18 pages
Conference location: Glasgow (GB)
Authors
Luis M.S. Russo; Arlindo L. Oliveira
Original format: PDF
Text language: English
Chinese Library Classification: Data backup and recovery