Conference: International Conference on Information and Knowledge Management

Predicting the cost-quality trade-off for information retrieval queries: Facilitating database design and query optimization

Abstract

Efficient, flexible, and scalable integration of full-text information retrieval (IR) in a DBMS is not trivial. This holds in particular for query optimization in such a context. To facilitate the bulk-oriented behavior of database query processing, a priori knowledge of how to limit the data efficiently before query evaluation is very valuable at optimization time. The usually imprecise nature of IR querying provides an extra opportunity to limit the data by trading it off against the quality of the answer. In this paper we present a mathematically derived model that predicts the quality implications of neglecting information before query execution. In particular, we investigate whether retrieval quality can be predicted for a document collection for which no training information is available, which is usually the case in practice. To that end, we construct a model that can be trained on other document collections for which the necessary quality information is available or can be obtained quite easily. We validate our model on several document collections and present the experimental results. These results show that our model performs quite well, even in the case where we did not train it on the test collection itself.
