Data Locality-Aware Big Data Query Evaluation in Distributed Clouds

Qiufen Xia; Weifa Liang; Zichuan Xu


Abstract

With more and more businesses and organizations outsourcing their IT services to distributed clouds for cost savings, the historical and operational data generated by these services have been growing exponentially. The generated data, referred to as big data and stored at geographically distributed datacenters, have now become an invaluable asset to these businesses and organizations, which can analyze the data to identify business advantages and make strategic decisions. Big data analytics has thus emerged as a main research topic in cloud computing. Efficiently evaluating a big data analytic query in a distributed cloud consisting of multiple datacenters at different geographic locations interconnected by the Internet poses great challenges: (i) the source data of the query are typically located at different datacenters; and (ii) the resource demands of the query may exceed the resources available at any single datacenter at that moment. In this paper, we formulate an online query evaluation problem for big data analytic queries in distributed clouds, with the objective of maximizing the query acceptance ratio while minimizing the accumulative query evaluation cost. To this end, we first propose a novel metric to model the usage of different resources in the distributed cloud, incorporating the capacities and workloads of different datacenters and links as well as the resource demands of different queries. We then devise efficient online algorithms for query evaluation under both unsplittable and splittable source data assumptions. Finally, we conduct extensive simulation experiments to evaluate the performance of the proposed algorithms. Experimental results demonstrate that the proposed algorithms are promising and outperform other heuristics at 95% confidence intervals.


Bibliographic details

Source: The Computer Journal, 2017, Issue 6, pp. 791-809 (19 pages)
Authors: Qiufen Xia; Weifa Liang; Zichuan Xu
Affiliation (all three authors): Research School of Computer Science, The Australian National University, Canberra, ACT 2601, Australia
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: query evaluation optimization; minimum cost multicommodity flow; big data analytics; data locality; distributed clouds

