
High Availability for Database Systems in Geographically Distributed Cloud Computing Environments


Abstract

In recent years, cloud storage systems have become very popular due to their good scalability and high availability. However, these storage systems provide limited transactional capabilities, which makes developing applications that use these systems substantially more difficult than developing applications that use a traditional SQL-based relational database management system (DBMS). There have been solutions that provide transactional SQL-based DBMS services on the cloud, including solutions that use cloud shared storage systems to store the data. However, none of these solutions take advantage of the shared cloud storage architecture to provide DBMS high availability. These solutions typically deal with the failure of a DBMS server by restarting this server and going through crash recovery based on the transaction log, which can lead to long DBMS service downtimes that are not acceptable to users. It is possible to run traditional DBMS high availability solutions in cloud environments. These solutions are typically based on shipping the transaction log from a primary server to a backup server, and replaying the log at the backup server to keep it up to date with the primary. However, these solutions do not work well if the primary and backup are in different, geographically distributed data centers due to the high latency of log shipping. Furthermore, these solutions do not take advantage of the capabilities of the underlying shared storage system.

We present a new transparent high availability system for transactional SQL-based DBMS on a shared storage architecture, which we call CAC-DB (Continuous Access Cloud DataBase). Our system is especially designed for eventually consistent cloud storage systems that run efficiently in multiple geographically distributed data centers. The database and transaction logs are stored in such a storage system, and therefore remain available after a failure up to the failure of an entire data center (e.g., in a natural disaster). CAC-DB takes advantage of this shared storage to ensure that the DBMS service remains available and transactionally consistent in the face of failures up to the loss of one or more data centers. By taking advantage of shared storage, CAC-DB can run in a geographically distributed environment with minimal overhead as compared to traditional log shipping solutions.

In CAC-DB, an active (primary) and a standby (backup) DBMS run on different servers in different data centers. The standby catches up with the active's memory state by replaying the shared log. When the active crashes, the standby can finish the failover process and reach peak throughput very quickly. The DBMS service only experiences several seconds of downtime. While the basic idea of replaying the log is simple and not new, the shared storage environment poses many new challenges, including the need for synchronization protocols, new buffer pool management mechanisms, approaches for guaranteeing strong consistency without sacrificing performance, and a new shared-storage-based failure detection mechanism. This thesis solves these challenges and presents a system that achieves the following goal: if a data center fails, not only does the persistent image of the database on the storage tier survive, but also the DBMS service can resume almost uninterrupted and reach peak throughput in a very short time. At the same time, the throughput of the DBMS service in normal processing is not negatively affected. Our experiments with CAC-DB running on EC2 confirm that it can achieve the above goals.
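The standby's catch-up by log replay can be pictured with a toy model. The sketch below is an illustration only, not CAC-DB's implementation: every class and method name is hypothetical, an in-memory list stands in for the shared cloud storage, and it ignores eventual consistency, synchronization, buffer pool management, and failure detection, which the abstract names as the hard parts.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLog:
    """Hypothetical stand-in for the transaction log in shared storage."""
    records: list = field(default_factory=list)

    def append(self, key, value):
        self.records.append((key, value))

    def read_from(self, position):
        # Return every record at or after the given log position.
        return self.records[position:]


class Active:
    """Primary DBMS: logs each update, then applies it in memory."""

    def __init__(self, log):
        self.log = log
        self.state = {}

    def write(self, key, value):
        self.log.append(key, value)
        self.state[key] = value


class Standby:
    """Backup DBMS: keeps its memory state warm by replaying the shared log."""

    def __init__(self, log):
        self.log = log
        self.state = {}
        self.applied = 0  # number of log records already replayed

    def catch_up(self):
        for key, value in self.log.read_from(self.applied):
            self.state[key] = value
            self.applied += 1

    def fail_over(self):
        # Only the not-yet-replayed tail of the log remains to be applied.
        self.catch_up()
        return self.state


log = SharedLog()
active = Active(log)
standby = Standby(log)

active.write("a", 1)
active.write("b", 2)
standby.catch_up()       # standby replays during normal processing
active.write("a", 3)     # logged just before the active "crashes"
recovered = standby.fail_over()
```

Because the standby has already replayed most of the log before the crash, failover in this model only has to apply the short unreplayed tail, which is the intuition behind the short downtime the abstract reports.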

Bibliographic details

  • Author

    Meng Huangdong

  • Author affiliation
  • Year 2014
  • Total pages
  • Original format PDF
  • Language en
  • Chinese Library Classification
