Journal: ACM Transactions on Information Systems

Selective Search: Efficient and Effective Search of Large Textual Collections


Abstract

The traditional search solution for large collections divides the collection into subsets (shards), and processes the query against all shards in parallel (exhaustive search). The search cost and the computational requirements of this approach are often prohibitively high for organizations with few computational resources. This article investigates and extends an alternative: selective search, an approach that partitions the dataset based on document similarity to obtain topic-based shards, and searches only a few shards that are estimated to contain relevant documents for the query. We propose shard creation techniques that are scalable, efficient, self-reliant, and create topic-based shards with low variance in size, and high density of relevant documents.
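The two steps named in the abstract (partition by document similarity, then search only the shards judged likely to hold relevant documents) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say which clustering or shard-ranking algorithms are used, so the k-means-style loop, the cosine-to-centroid shard ranking, and every name below (`build_shards`, `selective_search`, `DOCS`) are assumptions chosen for illustration only.

```python
import math
from collections import Counter

# Toy collection (hypothetical data, for illustration only).
DOCS = {
    "d1": "python code compiler bug",
    "d2": "compiler error in python code",
    "d3": "soccer match goal score",
    "d4": "soccer team wins match",
}


def tf(text):
    """Bag-of-words term frequencies for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Summed term frequencies; scale does not matter for cosine similarity."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def build_shards(docs, seed_ids, iterations=5):
    """Partition docs into topic shards with a tiny k-means-style loop.

    Assumption: seeds are given explicitly so the result is deterministic.
    """
    vecs = {doc_id: tf(text) for doc_id, text in docs.items()}
    centroids = [vecs[s] for s in seed_ids]
    shards = []
    for _ in range(iterations):
        shards = [[] for _ in centroids]
        for doc_id in sorted(vecs):
            sims = [cosine(vecs[doc_id], c) for c in centroids]
            shards[sims.index(max(sims))].append(doc_id)
        # Recompute each centroid; keep the old one if a shard ended up empty.
        centroids = [
            centroid([vecs[d] for d in shard]) if shard else centroids[i]
            for i, shard in enumerate(shards)
        ]
    return shards, centroids, vecs


def selective_search(query, shards, centroids, vecs, n_shards=1, k=3):
    """Rank shards by similarity to the query, then search only the top ones."""
    q = tf(query)
    ranked = sorted(range(len(shards)),
                    key=lambda i: cosine(q, centroids[i]), reverse=True)
    selected = ranked[:n_shards]
    candidates = [d for i in selected for d in shards[i]]
    hits = sorted(candidates, key=lambda d: cosine(q, vecs[d]), reverse=True)
    return selected, hits[:k]
```

Calling `build_shards(DOCS, ["d1", "d3"])` and passing the result to `selective_search("python compiler", ...)` scores only the documents in the selected shard. An exhaustive search would score all four documents. The saving is negligible at this scale, but it is the cost that the abstract describes as prohibitive for large collections.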
