Answering keyword queries through cached subqueries in best match retrieval models

Myron Papadakis; Yannis Tzitzikas


Abstract

Caching is one of the techniques that Information Retrieval Systems (IRSs) and Web Search Engines (WSEs) use to reduce processing costs and attain faster response times. In this paper we introduce Top-K SCRC (Set Cover Results Cache), a novel results caching technique that aims at maximizing cache utilization. Identical queries are treated as in plain results caching (i.e., their evaluation does not require accessing the index), while combinations of cached subqueries are exploited as in posting lists caching; however, the exploited subqueries are not necessarily single-word queries. The problem of finding the right set of cached subqueries to answer an incoming query is actually the Exact Set Cover problem. This technique can be applied in any best match retrieval model that is based on a decomposable scoring function, and we show that several best match retrieval models (i.e., VSM, Okapi BM25 and hybrid retrieval models) rely on such scoring functions. To increase the capacity of the cache (in queries), only the top-K results of each cached query are stored, and we introduce metrics for measuring the accuracy of the composed top-K answer. By analyzing queries submitted to real-world WSEs, we verified that there is a significant proportion of queries whose term set is the union of the term sets of other queries. The comparative evaluation over traces of real query sets showed that the Top-K SCRC is on average two times faster than a plain Top-K RC for the same cache size.
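
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (exact_cover, compose_top_k), the cache layout, and the sample scores are all hypothetical, and the scoring function is assumed to be purely additive over disjoint subqueries, which is one simple form of a decomposable scoring function. The sketch looks for pairwise disjoint cached subqueries whose terms exactly cover the incoming query, then sums their cached partial scores per document.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical cache layout: query terms -> {doc_id: cached score} (top-K only).
Cache = Dict[FrozenSet[str], Dict[str, float]]


def exact_cover(query: FrozenSet[str],
                cached: List[FrozenSet[str]]) -> Optional[List[FrozenSet[str]]]:
    """Backtracking search for disjoint cached subqueries whose union is `query`."""
    if not query:
        return []  # nothing left to cover
    term = min(query)  # any still-uncovered term must appear in exactly one subquery
    for sub in cached:
        if term in sub and sub <= query:
            # Recursing on the remaining terms keeps the chosen subqueries disjoint.
            rest = exact_cover(query - sub, cached)
            if rest is not None:
                return [sub] + rest
    return None  # no exact cover exists among the cached subqueries


def compose_top_k(cover: List[FrozenSet[str]], cache: Cache,
                  k: int) -> List[Tuple[str, float]]:
    """Sum cached partial scores per document (assumes an additive scoring function)."""
    totals: Dict[str, float] = {}
    for sub in cover:
        for doc, score in cache[sub].items():
            totals[doc] = totals.get(doc, 0.0) + score
    # Highest score first; ties broken by document id for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Illustrative data only; the terms, document ids, and scores are made up.
cache: Cache = {
    frozenset({"cheap", "flights"}): {"d1": 2.0, "d2": 1.5},
    frozenset({"athens"}): {"d2": 1.0, "d3": 0.8},
    frozenset({"cheap"}): {"d1": 0.9},
}
query = frozenset({"cheap", "flights", "athens"})
cover = exact_cover(query, list(cache))
if cover is not None:
    print(compose_top_k(cover, cache, k=2))
```

Because each cached entry holds only its top-K documents, a document missing from one subquery's list contributes nothing for that subquery here, so the composed ranking can differ from a full evaluation; this is the kind of inaccuracy the accuracy metrics mentioned in the abstract are meant to quantify.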


Bibliographic details

Source
Journal of Intelligent Information Systems | 2015, Issue 1 | pp. 67-106 | 40 pages
Authors
Myron Papadakis; Yannis Tzitzikas
Author affiliations

Institute of Computer Science (ICS), Foundation for Research and Technology - Hellas (FORTH), Science and Technology Park of Crete, Vassilika Vouton, P.O. Box 1385, Heraklion, Crete, 7110, Greece;
Computer Science Department, University of Crete, Voutes Campus, 700 13 Heraklion, Crete, Greece
(the same two affiliations are listed for both authors)

Indexing information
Original format: PDF
Language of text: English
Keywords
Information retrieval; Query processing; Retrieval models; Ranking; Web search engines; Query log analysis;


Similar literature

1. Answering keyword queries through cached subqueries in best match retrieval models [J]. Myron Papadakis, Yannis Tzitzikas. Journal of Intelligent Information Systems. 2015, Issue 1.
2. Answering top-K query combined keywords and structural queries on RDF graphs [J]. Peng Peng, Zou Lei, Qin Zheng. Information Systems. 2017, Issue JULa.
3. Answering why-not questions on top-k augmented spatial keyword queries [J]. Li Yanhong, Zhang Wang, Luo Changyin. Knowledge-Based Systems. 2021, Issue Jula8.
4. Inductive Query Answering and Concept Retrieval Exploiting Local Models [C]. d'Amato Claudia, Fanizzi Nicola, Esposito Floriana. Intelligent Systems Design and Applications, 2009. ISDA '09. 2009.
5. Consistent query answering of conjunctive queries under primary key constraints [D]. Pema, Enela. 2014.
6. Using the Weighted Keyword Model to Improve Information Retrieval for Answering Biomedical Questions [O]. Hong Yu, Yong-gang Cao. 2009.
7. Answering Keyword Queries through Cached Subqueries in Best Match Retrieval Models [O]. Myron Papadakis, Yannis Tzitzikas. 2015.
