Journal: Distributed and Parallel Databases

Distributed arrays: an algebra for generic distributed query processing



Abstract

We propose a simple model for distributed query processing based on the concept of a distributed array. Such an array has fields of some data type whose values can be stored on different machines. It offers operations to manipulate all fields in parallel within the distributed algebra. The arrays considered are one-dimensional and serve only to model a partitioned and distributed data set. Distributed arrays rest on a given set of data types and operations, called the basic algebra, which is implemented by a piece of software called the basic engine. The basic engine provides a complete environment for query processing on a single machine, and we assume this environment is extensible by types and operations. Operations on distributed arrays are implemented by one basic engine, called the master, which controls a set of basic engines called the workers. The master maps operations on distributed arrays to the respective operations on their fields, which are executed by the workers. The distributed algebra is completely generic: any type or operation added to the extensible basic engine is immediately available for distributed query processing. To demonstrate the use of the distributed algebra as a language for distributed query processing, we describe a fairly complex algorithm for distributed density-based similarity clustering. The algorithm is a novel contribution in itself. Its complete implementation is shown in terms of the distributed algebra and the basic algebra. The Secondo system is used as the basic engine; it is a rich environment for extensible query processing that provides useful tools such as main-memory M-trees, graphs, and a DBScan implementation.
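
The master/worker mapping described in the abstract can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration: the names `DArray`, `distribute`, and `dmap` are hypothetical and are not taken from the paper or from Secondo, threads stand in for worker engines on separate machines, and the modulo partitioning rule is just one possible choice.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass
class DArray(Generic[T]):
    """Hypothetical one-dimensional distributed array: one field per partition."""
    fields: List[T]      # field i holds one partition of the data set
    num_workers: int     # number of simulated workers (threads, by assumption)

    def dmap(self, op: Callable[[T], U]) -> "DArray[U]":
        # The "master" hands every field to the worker pool; each worker applies
        # the single-machine operation `op` to its field. The results, kept in
        # field order, form a new distributed array.
        with ThreadPoolExecutor(max_workers=self.num_workers) as pool:
            results = list(pool.map(op, self.fields))
        return DArray(results, self.num_workers)


def distribute(items: List[T], num_fields: int, num_workers: int) -> "DArray[List[T]]":
    # Assumed partitioning rule: the item at position p goes to field p % num_fields.
    fields: List[List[T]] = [[] for _ in range(num_fields)]
    for pos, item in enumerate(items):
        fields[pos % num_fields].append(item)
    return DArray(fields, num_workers)


# Usage: any ordinary single-machine function works as the per-field operation.
arr = distribute(list(range(10)), num_fields=3, num_workers=2)
sizes = arr.dmap(len)                    # fields: [4, 3, 3]
total = sum(arr.dmap(sum).fields)        # 45
```

Because `dmap` accepts any callable, a function newly added on the "basic engine" side is usable on distributed arrays without changing the array code, which is the sense of genericity the abstract describes.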
