Conference: International Symposium on Algorithms and Computation

Space-Efficient Data Structures for Flexible Text Retrieval Systems

Abstract

We propose space-efficient data structures for text retrieval systems that combine the merits of theoretical data structures such as suffix trees with those of practical ones such as inverted files. Traditional text retrieval systems use inverted files and support ranking queries based on the tf*idf (term frequency times inverse document frequency) scores of the documents that contain the given keywords; such queries cannot be answered using suffix trees alone. A drawback of these systems is that scores can be computed only for predetermined keywords. We extend the data structure so that scores can be computed efficiently for any pattern while keeping the size of the data structures moderate. The size is comparable to the text size, which improves on existing methods that use O(n log n) bits of space for a text collection of length n.
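
To make the ranking query concrete, the following is a minimal sketch of tf*idf scoring for an arbitrary pattern over a small document collection. It is a naive linear scan, not the space-efficient data structure the abstract proposes. The function names `tf_idf_scores` and `rank` are hypothetical, and the scoring variant (raw occurrence count times log(N / df)) is an assumption, since the abstract does not fix a formula.

```python
import math

def tf_idf_scores(docs, pattern):
    """Return {doc_index: tf*idf score} for the documents that contain `pattern`.

    Assumed variant: tf = number of (possibly overlapping) occurrences,
    idf = log(N / df), where N is the number of documents and df is the
    number of documents containing the pattern.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    tf = {}
    for i, text in enumerate(docs):
        count = 0
        start = text.find(pattern)
        while start != -1:
            count += 1
            # Advance by one character so overlapping occurrences are counted.
            start = text.find(pattern, start + 1)
        if count:
            tf[i] = count
    if not tf:
        return {}
    idf = math.log(len(docs) / len(tf))
    return {i: count * idf for i, count in tf.items()}

def rank(docs, pattern):
    """Document indices ordered by descending score, ties broken by index."""
    scores = tf_idf_scores(docs, pattern)
    return sorted(scores, key=lambda i: (-scores[i], i))

docs = ["abracadabra", "banana", "cabbage"]
print(rank(docs, "ab"))
```

The pattern here is any substring rather than a predetermined keyword, which is the flexibility the abstract aims for; the scan costs time proportional to the total text length per query, which is what an index is meant to avoid.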