
Indexing Multi-dimensional Data in a Cloud System


Abstract

Providing scalable database services is an essential requirement for extending many existing applications of the Cloud platform. Due to the diversity of applications, database services on the Cloud must support both large-scale data analytical jobs and highly concurrent OLTP queries. Most existing work focuses on one specific type of application. To provide an integrated framework, we are designing a new system, epiC, as our solution to next-generation database systems. In epiC, indexes play an important role in improving overall performance. Different types of indexes are built to provide efficient query processing for different applications.

In this paper, we propose RT-CAN, a multi-dimensional indexing scheme in epiC. RT-CAN integrates a CAN [23]-based routing protocol and an R-tree-based indexing scheme to support efficient multi-dimensional query processing in a Cloud system. RT-CAN organizes storage and compute nodes into an overlay structure based on an extended CAN protocol. In our proposal, we make the simple assumption that each compute node uses an R-tree-like indexing structure to index the data that are stored locally. We propose a query-conscious cost model that selects beneficial local R-tree nodes for publishing. By keeping the number of persistently connected nodes small and maintaining a global multi-dimensional search index, we can locate the compute nodes that may contain the answer within a few hops, making the scheme scalable in terms of data volume and number of compute nodes. Experiments on Amazon's EC2 show that our proposed routing protocol and indexing scheme are robust, efficient and scalable.
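To make the idea of publishing local R-tree nodes into a CAN-style overlay more concrete, the following is a toy Python sketch. It is not the RT-CAN protocol itself: the fixed grid of zones, the rule of publishing a bounding box to every zone it overlaps, and all names (`Rect`, `Zone`, `make_grid`, `publish`, `range_query`) are assumptions chosen for illustration. The abstract does not specify how published nodes are mapped to overlay nodes, and the sketch omits routing hops and the cost model entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in the unit square (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def overlaps(self, other: "Rect") -> bool:
        # Closed-interval overlap test on both axes.
        return (self.x1 <= other.x2 and other.x1 <= self.x2
                and self.y1 <= other.y2 and other.y1 <= self.y2)


@dataclass
class Zone:
    """One overlay node and the region of the data space it is responsible for."""
    owner: str
    area: Rect
    # Published entries: (name of a local R-tree node, its bounding box).
    published: list = field(default_factory=list)


def make_grid(n: int) -> list:
    """Assumption: a static n x n grid stands in for CAN's dynamic zone splits."""
    step = 1.0 / n
    return [
        Zone(f"node-{i}-{j}",
             Rect(i * step, j * step, (i + 1) * step, (j + 1) * step))
        for i in range(n) for j in range(n)
    ]


def publish(zones: list, rtree_node: str, mbr: Rect) -> None:
    """Assumption: a bounding box is registered with every zone it overlaps."""
    for zone in zones:
        if zone.area.overlaps(mbr):
            zone.published.append((rtree_node, mbr))


def range_query(zones: list, query: Rect) -> list:
    """Return names of published R-tree nodes whose boxes overlap the query."""
    hits = set()
    for zone in zones:
        if not zone.area.overlaps(query):
            continue  # this overlay node cannot hold a relevant entry
        for name, mbr in zone.published:
            if mbr.overlaps(query):
                hits.add(name)
    return sorted(hits)


zones = make_grid(2)
publish(zones, "A", Rect(0.1, 0.1, 0.2, 0.2))
publish(zones, "B", Rect(0.6, 0.6, 0.9, 0.9))
publish(zones, "C", Rect(0.4, 0.4, 0.6, 0.6))
print(range_query(zones, Rect(0.0, 0.0, 0.3, 0.3)))
print(range_query(zones, Rect(0.55, 0.55, 0.7, 0.7)))
```

In this simplified setting a query only inspects the overlay nodes whose zones intersect the query rectangle, and the matching entries then point to the compute nodes whose local R-trees would be searched. A real deployment would route between neighbouring zones rather than scan a list, and would publish only the R-tree nodes that a cost model judges worthwhile.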
