Venue: IEEE/ACM International Conference on Utility and Cloud Computing
Workload-Aware Incremental Repartitioning of Shared-Nothing Distributed Databases for Scalable Cloud Applications

Abstract

Cloud applications often rely on shared-nothing distributed databases that can sustain rapid growth in data volume. Distributed transactions (DTs) that involve data tuples from multiple geo-distributed servers can adversely impact the performance of such databases, especially when the transactions are short-lived and require an immediate response. The k-way min-cut graph clustering algorithm has been found effective in reducing the number of DTs with an acceptable level of load balancing. The benefits of such a static partitioning scheme, however, are short-lived in Cloud applications with dynamically varying workload patterns, where the DT profile changes over time. This paper addresses this emerging challenge by introducing incremental repartitioning. In each repartitioning cycle, the DT profile is learnt online and the k-way min-cut clustering algorithm is applied to a special sub-graph representing all DTs as well as those non-DTs that have at least one tuple in a DT. Including the latter ensures that the min-cut algorithm turns as few non-DTs as possible into new DTs while transforming as many existing DTs as possible into non-DTs in the new partitioning. The risk of load imbalance is mitigated by applying the graph clustering algorithm to the finer logical partitions instead of the servers, and by relying on a random one-to-one cluster-to-partition mapping that naturally balances out loads. Inter-server data migration due to repartitioning is kept in check with two special mappings that favour the current partition of the majority of tuples in a cluster: the many-to-one version, which minimises data migration alone, and the one-to-one version, which reduces data migration without affecting load balancing. A distributed data lookup process, inspired by the roaming protocol in mobile networks, is introduced to handle data migration efficiently without affecting scalability.
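As a rough illustration of the sub-graph selection step described above, the following Python sketch picks out the DTs and the non-DTs that share at least one tuple with a DT, then derives co-access edge weights that a k-way min-cut partitioner could consume. It is only a sketch under stated assumptions: the function names (`is_distributed`, `select_subworkload`, `co_access_edges`), the representation of a transaction as a set of tuple ids, and the `placement` dictionary are all hypothetical and are not taken from the paper. The min-cut step itself is not shown.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple


def is_distributed(txn: Set[str], placement: Dict[str, int]) -> bool:
    """A transaction counts as a DT when its tuples sit on more than one server."""
    return len({placement[t] for t in txn}) > 1


def select_subworkload(
    transactions: List[Set[str]], placement: Dict[str, int]
) -> List[Set[str]]:
    """Keep every DT, plus each non-DT that shares at least one tuple with a DT."""
    dts = [t for t in transactions if is_distributed(t, placement)]
    dt_tuples: Set[str] = set().union(*dts)
    movable = [
        t
        for t in transactions
        if not is_distributed(t, placement) and t & dt_tuples
    ]
    return dts + movable


def co_access_edges(sub: List[Set[str]]) -> Counter:
    """Edge weight = number of selected transactions that touch both tuples."""
    weights: Counter = Counter()
    for txn in sub:
        for pair in combinations(sorted(txn), 2):
            weights[pair] += 1
    return weights


if __name__ == "__main__":
    # Hypothetical placement of six tuples on two servers.
    placement = {"a": 0, "b": 1, "c": 0, "d": 0, "e": 1, "f": 1}
    transactions = [{"a", "b"}, {"a", "c"}, {"c", "d"}, {"e", "f"}]
    sub = select_subworkload(transactions, placement)
    # {a, b} is a DT; {a, c} is a non-DT sharing tuple "a" with it.
    print([sorted(t) for t in sub])
    print(dict(co_access_edges(sub)))
```

Keeping the tuple-sharing non-DTs in the sub-graph means their edges carry weight in the cut, so a partitioner is discouraged from splitting them while it tries to bring the tuples of existing DTs together.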
The effectiveness of the proposed framework is comprehensively evaluated on realistic TPC-C workloads using the graph, hypergraph, and compressed hypergraph representations used in the literature. Simulation results convincingly support incremental repartitioning over static partitioning.
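The two migration-aware cluster-to-partition mappings mentioned in the abstract can be illustrated with a small Python sketch. This is an assumption-laden toy, not the paper's procedure: the greedy assignment used for the one-to-one case, the tie-breaking rule, and all names (`many_to_one`, `one_to_one`, `migrations`, `current`) are hypothetical choices made here for illustration.

```python
from collections import Counter
from typing import Dict, List, Set


def overlap(cluster: Set[str], current: Dict[str, int]) -> Counter:
    """Count how many tuples of the cluster already sit on each partition."""
    return Counter(current[t] for t in cluster)


def many_to_one(clusters: List[Set[str]], current: Dict[str, int]) -> List[int]:
    """Send each cluster to the partition holding most of its tuples.

    Several clusters may land on the same partition (ties go to the lower id).
    """
    result = []
    for c in clusters:
        counts = overlap(c, current)
        result.append(max(counts, key=lambda p: (counts[p], -p)))
    return result


def one_to_one(clusters: List[Set[str]], current: Dict[str, int], k: int) -> List[int]:
    """Greedy sketch (an assumption): give each cluster a distinct partition,
    taking the largest overlaps first. Requires len(clusters) <= k."""
    pairs = []
    for ci, c in enumerate(clusters):
        counts = overlap(c, current)
        for p in range(k):
            pairs.append((-counts[p], ci, p))
    pairs.sort()
    mapping: Dict[int, int] = {}
    used: Set[int] = set()
    for _, ci, p in pairs:
        if ci not in mapping and p not in used:
            mapping[ci] = p
            used.add(p)
    return [mapping[ci] for ci in range(len(clusters))]


def migrations(clusters: List[Set[str]], mapping: List[int], current: Dict[str, int]) -> int:
    """Number of tuples whose assigned partition differs from the current one."""
    return sum(1 for c, p in zip(clusters, mapping) for t in c if current[t] != p)


if __name__ == "__main__":
    current = {"a": 0, "b": 0, "c": 1, "d": 0, "e": 0, "f": 1}
    clusters = [{"a", "b", "c"}, {"d", "e", "f"}]
    m2o = many_to_one(clusters, current)
    o2o = one_to_one(clusters, current, k=2)
    print(m2o, migrations(clusters, m2o, current))  # [0, 0] 2
    print(o2o, migrations(clusters, o2o, current))  # [0, 1] 3
```

In this toy case the many-to-one mapping moves fewer tuples but places both clusters on partition 0, whereas the one-to-one mapping moves one more tuple and keeps the partitions evenly loaded, which mirrors the trade-off the abstract describes.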