Inverted Files Versus Suffix Arrays for Locating Patterns in Primary Memory


Abstract

Recent advances in the asymptotic resource costs of pattern matching with compressed suffix arrays are attractive, but a key rival structure, the compressed inverted file, has been dismissed or ignored in papers presenting the new structures. In this paper we examine the resource requirements of compressed suffix array algorithms against compressed inverted file data structures for general pattern matching in genomic and English texts. In both cases, the inverted file indexes q-grams, thus allowing full pattern-matching capabilities, rather than simple word-based search, making its functionality equivalent to that of the compressed suffix array structures. When using equivalent memory for the two structures, inverted files are faster at reporting the location of patterns when the number of occurrences of the patterns is high.
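To make the comparison in the abstract concrete, here is a minimal Python sketch of the two ways of locating a pattern: a q-gram inverted file and a suffix array. It is an illustration only and not the authors' implementation. Both structures are left uncompressed, the suffix array is built by naive sorting, and all function names, and the strategy of checking only as many q-grams as are needed to cover the pattern, are assumptions made for this example.

```python
from collections import defaultdict


def build_qgram_index(text, q):
    """Map each overlapping q-gram of text to the sorted list of its start positions."""
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def locate_inverted(index, q, pattern):
    """Return sorted start positions of pattern, which must have length >= q."""
    if len(pattern) < q:
        raise ValueError("pattern must be at least q characters long")
    # Take q-grams of the pattern at offsets 0, q, 2q, ... and add one
    # aligned with the end of the pattern so every character is covered.
    offsets = list(range(0, len(pattern) - q + 1, q))
    last = len(pattern) - q
    if offsets[-1] != last:
        offsets.append(last)
    candidates = None
    for off in offsets:
        postings = index.get(pattern[off:off + q], [])
        # Shift each posting back to the pattern start it would imply.
        shifted = {p - off for p in postings}
        candidates = shifted if candidates is None else candidates & shifted
        if not candidates:
            return []
    return sorted(candidates)


def build_suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def locate_suffix_array(text, sa, pattern):
    """Return sorted start positions of pattern via two binary searches on sa."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


text = "GATTACAGATTACA"
q = 3
index = build_qgram_index(text, q)
sa = build_suffix_array(text)
print(locate_inverted(index, q, "ATTAC"))         # [1, 8]
print(locate_suffix_array(text, sa, "ATTAC"))     # [1, 8]
```

In this sketch the inverted file reads the posting lists of the pattern's q-grams and intersects them, while the suffix array first narrows down a contiguous range of suffixes and then reports each entry in it. In the compressed variants the abstract refers to, the cost of reporting each occurrence is what differs between the two structures; the sketch does not model that.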


Bibliographic details

Source
String Processing and Information Retrieval; Lecture Notes in Computer Science; 4209 | 2006 | pp. 122-133 | 12 pages
Conference location: Glasgow (GB)
Authors
Simon J. Puglisi; W. F. Smyth; Andrew Turpin
Author affiliations

Curtin University of Technology, Perth, Australia;

McMaster University, Hamilton, Canada;

RMIT University, Melbourne, Australia;

Original format: PDF
Language: English (eng)
Chinese Library Classification: Data backup and recovery

Similar literature

1. Repeated patterns detection in big data using classification and parallelism on LERP Reduced Suffix Arrays [J]. Xylogiannopoulos Konstantinos F., Karampelas Panagiotis, Alhajj Reda. Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies. 2016, No. 3
2. A quick tour on suffix arrays and compressed suffix arrays [J]. Roberto Grossi. Theoretical Computer Science. 2011, No. 27
3. Linearized Suffix Tree: an Efficient Index Data Structure with the Capabilities of Suffix Trees and Suffix Arrays [J]. Dong Kyue Kim, Minhwan Kim, Heejin Park. Algorithmica. 2008, No. 3
4. Inverted Files Versus Suffix Arrays for Locating Patterns in Primary Memory [C]. Simon J. Puglisi, W. F. Smyth, Andrew Turpin. International Conference on String Processing and Information Retrieval. 2006
5. Suffix trees and suffix arrays in primary and secondary storage [D]. Ko, Pang. 2007
6. GHOSTX: An Improved Sequence Homology Search Algorithm Using a Query Suffix Array and a Database Suffix Array [O]. Shuji Suzuki, Masanori Kakuta, Takashi Ishida
7. Suffix trees and suffix arrays in primary and secondary storage [O]. Ko, Pang. 2007
8. Superimposed Coding Versus Sequential and Inverted Files [R]. Hickey, Thomas Butler
