Source: Information retrieval technology (foreign-language conference proceedings)

Efficient Top-k Document Retrieval Using a Term-Document Binary Matrix



Abstract

Current web search engines perform well for "navigational queries." However, because they rely on simple conjunctive Boolean filters, such engines perform poorly for "informational queries." Informational queries would be better handled by a web search engine that uses an information retrieval model together with a combination of enhancement techniques such as query expansion and relevance feedback. Realizing such an engine requires a method to process the model efficiently. In this paper, we describe a novel extension of an existing top-k query processing technique. We add a simple data structure called a "term-document binary matrix," resulting in more efficient evaluation of top-k queries even when the queries have been expanded. Experimental evaluation using the TREC GOV2 data set, with expanded versions of the evaluation queries attached to this data set, shows that the extended technique achieves significant performance gains over existing techniques.
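
The abstract does not say how the term-document binary matrix is laid out or how it plugs into the underlying top-k technique, so the following is only a rough illustration of the general idea, not the authors' method. It assumes one bitset per term (a Python integer whose bit d is set when document d contains the term) and uses a plain sum of query-term weights as a stand-in scoring function. The names `build_binary_matrix` and `top_k` and the sample data are hypothetical.

```python
from heapq import nlargest


def build_binary_matrix(docs, vocabulary):
    """Build one bitset per term: bit d is set if document d contains the term.

    Assumption for illustration: whitespace tokenization and lowercasing.
    """
    matrix = {term: 0 for term in vocabulary}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            if term in matrix:
                matrix[term] |= 1 << doc_id
    return matrix


def top_k(query_weights, matrix, num_docs, k):
    """Return the ids of the k best documents for a weighted (expanded) query.

    The score is a stand-in: the sum of the weights of the query terms that
    the document contains. Each membership test is a single bit lookup, so
    adding expansion terms does not require scanning extra posting lists.
    """
    scored = []
    for doc_id in range(num_docs):
        score = sum(
            weight
            for term, weight in query_weights.items()
            if (matrix.get(term, 0) >> doc_id) & 1
        )
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties go to the smaller document id.
    best = nlargest(k, scored, key=lambda pair: (pair[0], -pair[1]))
    return [doc_id for _, doc_id in best]


# Hypothetical sample collection and an expanded query.
docs = [
    "tax forms download",
    "federal tax law history",
    "history of national parks",
]
vocabulary = {"tax", "forms", "law", "history", "parks"}
matrix = build_binary_matrix(docs, vocabulary)
expanded_query = {"tax": 1.0, "law": 0.5, "legislation": 0.5}
result = top_k(expanded_query, matrix, len(docs), k=2)
```

In this sketch the expansion term "legislation" is absent from the vocabulary and simply contributes nothing, while "law" raises the score of the second document. A real system would combine such membership tests with a proper ranking model and index-based pruning.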
