
Availability, scalability and cost-effectiveness of cluster-based Internet infrastructures.



Abstract

Clusters of commodity computers are a cost-effective system structure for large-scale Internet services. Availability and scalability are two major concerns in the design of such a system. My dissertation examines the opportunities in the data storage systems for improving the availability and scalability of cluster-based Internet infrastructures at a low cost. The goal of availability is to maximize the percentage of client requests that succeed despite the failure of one or more servers in the cluster. The goal of scalability is to efficiently scale the server throughput with the cluster size. My basic approach is to investigate the data distribution strategies across nodes in the cluster, i.e., how to partition and replicate data on disk or in memory in order to achieve high availability and scalability.

Maintaining availability in the face of failures is a critical requirement for Internet services. Existing approaches in cluster-based data storage rely on redundancy to survive a small number of failures, but the system becomes largely unavailable if more failures occur. I study a failure isolation approach that partitions and replicates data and metadata across cluster nodes in such a way that the server in each node can deliver data to clients independently of the failures in other nodes. This approach is complementary to existing redundancy-based methods: redundancy can mask the first few failures, and failure isolation can take over and maintain availability for the majority of clients if more failures occur.

I also study how to improve the performance of Internet application servers in a cost-effective way by using a cluster of in-memory databases as the cache for dynamic content. In particular, I investigate how to dynamically partition and replicate data across individual databases in the cluster and how to direct queries to the right databases in order to maximize effective cache capacity and minimize synchronization cost. Despite the conflicts across queries for dynamic content, I observe natural query affinity in a wide range of Internet applications, which could be exploited in management strategies.
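
The abstract does not describe a concrete placement scheme, so the following is only an illustrative sketch of the general idea of partitioning and replicating data across cluster nodes and routing each request to a live replica. It assumes a simple hash-based placement with successor replicas, which is a generic technique and not necessarily the one used in the dissertation; the function names `replicas_for`, `route`, and `availability` are hypothetical.

```python
import hashlib


def replicas_for(key, nodes, k=2):
    """Return the k nodes assumed to hold `key`: a hashed home node plus its successors."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(min(k, len(nodes)))]


def route(key, nodes, failed, k=2):
    """Return the first live replica for `key`, or None if every replica has failed."""
    for node in replicas_for(key, nodes, k):
        if node not in failed:
            return node
    return None


def availability(keys, nodes, failed, k=2):
    """Fraction of keys that can still be served given the set of failed nodes."""
    served = sum(1 for key in keys if route(key, nodes, failed, k) is not None)
    return served / len(keys)
```

With `k=2`, any single node failure leaves every key servable in this sketch, because each key has two distinct replicas. Once more nodes fail, only the keys whose replicas are all down become unavailable, while requests for the remaining keys still succeed.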

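Bibliographic record

  • Author

    Ji, Minwen

  • Author affiliation

    Princeton University

  • Degree-granting institution: Princeton University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2001
  • Pagination: 118 p.
  • Total pages: 118
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords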
