POLYTOPE: a flexible sampling system for answering exploratory queries

Wu Zhigang; Jing Yinan; He Zhenying; Guo Chenghao; Wang X. Sean

Abstract

Data exploration tasks are usually time-consuming. Analysts who want to find items of interest or verify a hypothesis may prefer a lower response time and accept a bounded error in return. Approximate query processing (AQP) achieves this by using pre-computed samples to speed up query answering. Existing sampling-based AQP systems usually apply a single sampling strategy to the whole dataset. During data exploration, however, potential interests may be spread across different parts of the dataset, so the queries that users submit vary widely from one sub-dataset to another. A single sampling strategy is therefore not adequate for all queries accessing the various sub-datasets. In this paper, we propose POLYTOPE, a flexible and effective sampling system designed specifically for data exploration tasks. It rests on three key ideas: (1) split the dataset into sampling blocks according to the user query patterns, (2) individually generate a set of optimized samples for each sampling block, and (3) automatically select an optimal sample at run time. We use both user query patterns and the underlying data distribution to realize these ideas. We have implemented our system on the Spark platform, and our comprehensive experimental results show that it improves accuracy by up to 46% under the same time constraint for data exploration tasks.
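
The three ideas in the abstract can be illustrated with a small single-machine sketch. This is not the POLYTOPE implementation, which the abstract only describes at a high level and which runs on Spark. Every name here (`split_into_blocks`, `build_samples`, `pick_sample`, `approx_mean`) is hypothetical, and so are the design choices: blocks are formed by a simple partition key rather than by mined query patterns, samples are plain uniform samples at a few fixed rates, and the run-time choice is the largest sample that fits a row budget standing in for a time constraint.

```python
import math
import random
import statistics


def split_into_blocks(rows, key):
    """Group rows into sampling blocks by a partition key (an assumed
    stand-in for blocks derived from user query patterns)."""
    blocks = {}
    for row in rows:
        blocks.setdefault(key(row), []).append(row)
    return blocks


def build_samples(block, rates, seed=0):
    """Pre-compute one uniform random sample of the block per sampling rate."""
    rng = random.Random(seed)
    return {rate: rng.sample(block, max(1, round(len(block) * rate)))
            for rate in rates}


def pick_sample(samples, row_budget):
    """Choose the largest pre-computed sample within the row budget,
    falling back to the smallest sample if none fits."""
    fitting = [s for s in samples.values() if len(s) <= row_budget]
    if fitting:
        return max(fitting, key=len)
    return min(samples.values(), key=len)


def approx_mean(sample, value):
    """Return (estimate, half_width) for the mean, using a 95% normal
    approximation for the error bound."""
    values = [value(row) for row in sample]
    estimate = statistics.fmean(values)
    if len(values) < 2:
        return estimate, math.inf
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return estimate, half_width


# Toy data: (region, amount) rows; "region" plays the role of the block key.
data_rng = random.Random(42)
rows = [("east", data_rng.gauss(100, 10)) for _ in range(10_000)]
rows += [("west", data_rng.gauss(500, 200)) for _ in range(10_000)]

# Ideas (1) and (2): split into blocks, then pre-compute samples per block.
blocks = split_into_blocks(rows, key=lambda r: r[0])
catalog = {name: build_samples(block, rates=(0.01, 0.05, 0.2))
           for name, block in blocks.items()}

# Idea (3): pick a sample at run time and answer with an error bound.
sample = pick_sample(catalog["west"], row_budget=1_000)
estimate, half_width = approx_mean(sample, value=lambda r: r[1])
print(f"west mean ~ {estimate:.1f} +/- {half_width:.1f} from {len(sample)} rows")
```

With a budget of 1,000 rows, the sketch answers from the 5% sample of the "west" block (500 rows) and skips the 20% sample (2,000 rows). A query restricted to one block never touches samples drawn for the other, which is the point of sampling per block rather than over the whole dataset.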

Bibliographic details

Source
World Wide Web, 2020, Issue 1, pp. 1-22 (22 pages)
Authors
Wu Zhigang; Jing Yinan; He Zhenying; Guo Chenghao; Wang X. Sean
Affiliations

Shanghai Key Lab Data Sci Shanghai Peoples R China|Fudan Univ Sch Comp Sci Shanghai Peoples R China;

Shanghai Key Lab Data Sci Shanghai Peoples R China|Fudan Univ Sch Comp Sci Shanghai Peoples R China|Shanghai Inst Intelligent Elect & Syst Shanghai Peoples R China;

Original format: PDF
Language: English
Keywords
Sampling; Data exploration; Approximate query processing; Data warehouse
