Conference proceedings: String Processing and Information Retrieval; Lecture Notes in Computer Science; 4209

Inverted Files Versus Suffix Arrays for Locating Patterns in Primary Memory

Abstract

Recent advances in the asymptotic resource costs of pattern matching with compressed suffix arrays are attractive, but a key rival structure, the compressed inverted file, has been dismissed or ignored in papers presenting the new structures. In this paper we examine the resource requirements of compressed suffix array algorithms against compressed inverted file data structures for general pattern matching in genomic and English texts. In both cases, the inverted file indexes q-grams, thus allowing full pattern matching capabilities, rather than simple word-based search, making their functionality equivalent to the compressed suffix array structures. When using equivalent memory for the two structures, inverted files are faster at reporting the location of patterns when the number of occurrences of the patterns is high.
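
To make the idea of an inverted file over q-grams concrete, the following is a minimal, uncompressed sketch in Python. It is an illustration only and not the compressed structure evaluated in the paper: the function names `build_qgram_index` and `locate`, the choice of q, and the strategy of covering the pattern with q-grams and intersecting their shifted position lists are all assumptions made for this example.

```python
from collections import defaultdict


def build_qgram_index(text, q):
    """Map every q-gram of `text` to the ascending list of its start positions."""
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def locate(index, q, pattern):
    """Return all start positions of `pattern`, assuming len(pattern) >= q.

    The pattern is covered by q-grams taken at offsets 0, q, 2q, ... plus
    one final q-gram aligned to the end of the pattern. A text position is
    a match exactly when every covering q-gram occurs at that position
    plus the q-gram's offset.
    """
    if len(pattern) < q:
        raise ValueError("pattern must be at least q characters long")

    offsets = list(range(0, len(pattern) - q + 1, q))
    if offsets[-1] != len(pattern) - q:
        offsets.append(len(pattern) - q)

    candidates = None
    for off in offsets:
        # Shift each posting back by the offset to get candidate pattern starts.
        shifted = {p - off for p in index.get(pattern[off:off + q], ())}
        candidates = shifted if candidates is None else candidates & shifted
        if not candidates:
            return []
    return sorted(candidates)


if __name__ == "__main__":
    text = "GATTACAGATTACA"
    idx = build_qgram_index(text, 3)
    print(locate(idx, 3, "ATTAC"))
```

Because every character of the pattern falls inside at least one of the covering q-grams, no separate verification pass against the text is needed in this sketch. A practical inverted file would store the position lists in compressed form (for example as encoded gaps between positions), which this example omits.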
