Workload-Aware Incremental Repartitioning of Shared-Nothing Distributed Databases for Scalable Cloud Applications

Abstract

Cloud applications often rely on shared-nothing distributed databases that can sustain rapid growth in data volume. Distributed transactions (DTs) that involve data tuples from multiple geo-distributed servers can adversely impact the performance of such databases, especially when the transactions are short-lived and require an immediate response. The k-way min-cut graph clustering algorithm has been found effective in reducing the number of DTs with an acceptable level of load balancing. The benefits of such a static partitioning scheme, however, are short-lived in Cloud applications with dynamically varying workload patterns, where the DT profile changes over time. This paper addresses this emerging challenge by introducing incremental repartitioning. In each repartitioning cycle, the DT profile is learnt online and the k-way min-cut clustering algorithm is applied to a special sub-graph representing all DTs as well as those non-DTs that have at least one tuple in a DT. Including the latter ensures that the min-cut algorithm minimally reintroduces new DTs from the non-DTs while maximally transforming existing DTs into non-DTs in the new partitioning. The risk of load imbalance is mitigated by applying the graph clustering algorithm to the finer logical partitions instead of the servers and relying on a random one-to-one cluster-to-partition mapping that naturally balances out loads. Inter-server data migration due to repartitioning is kept in check with two special mappings that favour the current partition of the majority of tuples in a cluster: the many-to-one version minimises data migration alone, and the one-to-one version reduces data migration without affecting load balancing. A distributed data lookup process, inspired by the roaming protocol in mobile networks, is introduced to handle data migration efficiently without affecting scalability.
The effectiveness of the proposed framework is evaluated comprehensively on realistic TPC-C workloads using the graph, hypergraph, and compressed hypergraph representations used in the literature. Simulation results convincingly support incremental repartitioning over static partitioning.
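
The sub-graph selection and the two majority-favouring mappings described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names and the toy data are hypothetical, the k-way min-cut clustering step is assumed to have already been done by an external graph partitioner, and the greedy largest-overlap-first rule used for the one-to-one mapping is an assumption made here for illustration only.

```python
from collections import Counter

def is_distributed(txn, tuple_server):
    """A transaction is a DT if its tuples live on more than one server."""
    return len({tuple_server[t] for t in txn}) > 1

def select_subgraph_transactions(txns, tuple_server):
    """Return all DTs plus the non-DTs that share at least one tuple with a DT."""
    dts = [txn for txn in txns if is_distributed(txn, tuple_server)]
    dt_tuples = set().union(*dts) if dts else set()
    shared = [txn for txn in txns
              if not is_distributed(txn, tuple_server) and dt_tuples & set(txn)]
    return dts, shared

def many_to_one_mapping(clusters, tuple_partition):
    """Map each cluster to the partition that already holds most of its tuples.
    Several clusters may land on the same partition."""
    return {cid: Counter(tuple_partition[t] for t in members).most_common(1)[0][0]
            for cid, members in clusters.items()}

def one_to_one_mapping(clusters, tuple_partition, partitions):
    """Assumed greedy variant: use each partition at most once, taking the
    (cluster, partition) pairs in order of decreasing tuple overlap."""
    overlaps = []
    for cid, members in clusters.items():
        counts = Counter(tuple_partition[t] for t in members)
        for p in partitions:
            overlaps.append((counts.get(p, 0), cid, p))
    overlaps.sort(key=lambda x: (-x[0], x[1], x[2]))  # deterministic tie-break
    mapping, used = {}, set()
    for _, cid, p in overlaps:
        if cid not in mapping and p not in used:
            mapping[cid] = p
            used.add(p)
    return mapping

def migrations(clusters, mapping, tuple_partition):
    """Count tuples whose partition changes under the given mapping."""
    return sum(1 for cid, members in clusters.items()
               for t in members if tuple_partition[t] != mapping[cid])

# Toy data (hypothetical): tuple id -> current location.
location = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 0, "g": 1, "h": 1}
txns = [frozenset("ad"), frozenset("ab"), frozenset("gh")]
dts, shared = select_subgraph_transactions(txns, location)

# Clusters as they might come out of a min-cut partitioner (assumed output).
clusters = {0: ["a", "b", "d"], 1: ["c", "e", "f"]}
m2o = many_to_one_mapping(clusters, location)
o2o = one_to_one_mapping(clusters, location, partitions=[0, 1])
```

On this toy data the many-to-one mapping sends both clusters to partition 0 and moves two tuples, while the one-to-one mapping keeps one cluster per partition at the cost of three moves, which mirrors the trade-off between migration cost and load balance stated in the abstract.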

Bibliographic details

Source
IEEE/ACM International Conference on Utility and Cloud Computing, 2014, pp. 213-222 (10 pages)
Authors
Kamal Joarder Mohammad Mustafa; Murshed Manzur; Buyya Rajkumar
Original format
PDF
Keywords
cloud computing; data mining; distributed databases; graph theory; pattern clustering; resource allocation; DT profile; compressed hypergraph representations; distributed transactions; interserver data-migration; k-way mincut graph clustering algorithm; load balancing; mobile networks; multiple geodistributed servers; one-to-one cluster-to-partition mapping; potential load imbalance risk; realistic TPC-C workloads; roaming protocol; scalable cloud application; static partitioning scheme; workload-aware incremental repartitioning; Clustering algorithms; Distributed databases; Heuristic algorithms; Partitioning algorithms; Routing; Servers; Cloud databases; data migration; distributed transactions; incremental repartitioning; load-balance; workload;


