Evaluating Top-k Queries Over Web-Accessible Databases

AMELIE MARIAN; NICOLAS BRUNO; LUIS GRAVANO

首页> 外文期刊>ACM transactions on database systems >Evaluating Top-k Queries Over Web-Accessible Databases

【24h】

Evaluating Top-k Queries Over Web-Accessible Databases

机译：通过Web可访问的数据库评估前k个查询

获取原文

获取原文并翻译 | 示例

开具论文收录证明 >>

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

A query to a web search engine usually consists of a list of keywords, to which the search engine responds with the best or "top" k pages for the query. This top-k query model is prevalent over multimedia collections in general, but also over plain relational data for certain applications. For example, consider a relation with information on available restaurants, including their location, price range for one diner, and overall food rating. A user who queries such a relation might simply specify the user's location and target price range, and expect in return the best 10 restaurants in terms of some combination of proximity to the user, closeness of match to the target price range, and overall food rating. Processing top-k queries efficiently is challenging for a number of reasons. One critical such reason is that, in many web applications, the relation attributes might not be available other than through external web-accessible form interfaces, which we will have to query repeatedly for a potentially large set of candidate objects. In this article, we study how to process top-k queries efficiently in this setting, where the attributes for which users specify target values might be handled by external, autonomous sources with a variety of access interfaces. We present a sequential algorithm for processing such queries, but observe that any sequential top-k query processing strategy is bound to require unnecessarily long query processing times, since web accesses exhibit high and variable latency. Fortunately, web sources can be probed in parallel, and each source can typically process concurrent requests, although sources may impose some restrictions on the type and number of probes that they are willing to accept. We adapt our sequential query processing technique and introduce an efficient algorithm that maximizes source-access parallelism to minimize query response time, while satisfying source-access constraints. We evaluate our techniques experimentally using both synthetic and real web-accessible data and show that parallel algorithms can be significantly more efficient than their sequential counterparts.

机译：对Web搜索引擎的查询通常由关键字列表组成，搜索引擎用关键字的最佳或“前” k个页面对关键字进行响应。通常，此top-k查询模型普遍存在于多媒体集合中，但对于某些应用程序，也普遍存在于普通关系数据中。例如，考虑与可用餐厅信息有关的关系，包括其位置，一个餐厅的价格范围以及整体食品等级。查询这种关系的用户可以简单地指定用户的位置和目标价格范围，并根据与用户的接近程度，与目标价格范围的匹配程度以及整体食品等级的组合，期望获得最佳的10家餐厅。出于多种原因，有效处理前k个查询具有挑战性。一个关键的原因是，在许多Web应用程序中，除了通过外部可通过Web访问的表单界面外，关系属性可能不可用，我们将不得不反复查询潜在的大量候选对象集。在本文中，我们研究如何在此设置中有效地处理前k个查询，在该设置中，用户指定目标值的属性可能由具有各种访问接口的外部自主源处理。我们提出了一种用于处理此类查询的顺序算法，但是您会发现，由于Web访问展现出高且可变的延迟，因此任何顺序的top-k查询处理策略都必然需要不必要的长查询处理时间。幸运的是，尽管可以对Web源进行并行探测，但是每个源通常可以处理并发请求，尽管这些源可能会对它们愿意接受的探测的类型和数量施加一些限制。我们采用顺序查询处理技术，并引入了一种有效的算法，该算法可以在满足源访问约束的同时，最大化源访问并行性以最小化查询响应时间。我们使用合成数据和实际可通过网络访问的数据实验性地评估了我们的技术，并表明并行算法比顺序算法可以有效得多。

著录项

来源
《ACM transactions on database systems》 |2004年第2期|p.319-362|共44页
作者
AMELIE MARIAN; NICOLAS BRUNO; LUIS GRAVANO;
展开▼
作者单位

Computer Science Department, Columbia University, 1214 Amsterdam Ave., New York, NY 10027;

展开▼
收录信息美国《科学引文索引》(SCI);美国《工程索引》(EI);
原文格式 PDF
正文语种 eng
中图分类计算机软件;
关键词
parallel query processing; query optimization; top-k query processing; web databases;

机译：并行查询处理;查询优化;top-k查询处理;Web数据库;

相似文献

外文文献
中文文献
专利

1. Semantics and evaluation of top-k queries in probabilistic databases [J] . Xi Zhang, Jan Chomicki Distributed and Parallel Databases . 2009,第1期

机译：概率数据库中前k个查询的语义和评估
2. Top-k Selection Queries over Relational Databases: Mapping Strategies and Performance Evaluation [J] . NICOLAS BRUNO, LUIS GRAVANO ACM transactions on database systems . 2002,第2期

机译：关系数据库的前k个选择查询：映射策略和性能评估
3. A new Top-k query processing algorithm to guarantee confidentiality of data and user queries on outsourced databases [J] . Hyeong-Jin Kim, Jae-Woo Chang International journal of systems assurance engineering and management . 2019,第5期

机译：一种新的Top-k查询处理算法，可确保外包数据库中数据和用户查询的机密性
4. Evaluating top-k queries over Web-accessible databases [C] . Bruno, N., Gravano, . 2002

机译：通过Web可访问的数据库评估前k个查询
5. CPR: Complex Pattern Ranking for Evaluating Top-k Pattern Queries over Event Streams. [D] . Wang, Xinxin. 2011

机译：CPR：用于评估事件流中前k个模式查询的复杂模式排名。
6. Geo-Social Top-k and Skyline Keyword Queries on Road Networks [O] . Muhammad Attique, Muhammad Afzal, Farman Ali, 2020

机译：道路网络上的Geo-Social Top-k和Skyline关键字查询
7. Evaluating Top-k Queries over Web-Accessible Databases [O] . Gravano Luis, Marian Amelie, Bruno Nicolas 2002

机译：通过Web可访问的数据库评估前k个查询

Evaluating Top-k Queries Over Web-Accessible Databases

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅