Implementing data cube construction using a cluster middleware: algorithms, implementation experience, and performance evaluation


Abstract

With increases in the amount of data available for analysis in commercial settings, On Line Analytical Processing (OLAP) and decision support have become important applications for high performance computing. Implementing such applications on clusters requires a lot of expertise and effort, particularly because of the sizes of input and output datasets. In this paper, we describe our experiences in developing one such application using a cluster middleware, called ADR. We focus on the problem of data cube construction, which commonly arises in multi-dimensional OLAP. We show how ADR, originally developed for scientific data intensive applications, can be used for carrying out an efficient and scalable data cube construction implementation. A particular issue with the use of ADR is tiling of output datasets. We present new algorithms that combine interprocessor communication and tiling within each processor. These algorithms preserve the important properties that are desirable from any parallel data cube construction algorithm. We have carried out a detailed evaluation of our implementation. The main results from our experiments are as follows: 1) High speedups are achieved on both dense and sparse datasets, even though we have used simple algorithms that sequentialize a part of the computation, 2) The execution time depends only upon the amount of computation, and does not increase in a super-linear fashion as the dataset size or the number of tiles increases, and 3) As the datasets become more sparse, sequential performance degrades, but the parallel speedups are still quite good.
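For readers unfamiliar with the term, data cube construction means computing an aggregate for every subset of the dimensions of a multi-dimensional dataset. The following is a minimal sequential sketch of that idea only, assuming a sparse dataset stored as a dictionary of coordinate tuples and a sum aggregate. It is not the paper's algorithm: it uses neither ADR nor tiling nor interprocessor communication, and the function name `build_data_cube` and its parameters are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_data_cube(facts, num_dims):
    """Compute sum aggregates for every subset of dimensions.

    facts: dict mapping coordinate tuples (length num_dims) to numeric values.
    Returns a dict mapping each tuple of kept dimensions to its aggregate table.
    """
    all_dims = tuple(range(num_dims))
    cube = {all_dims: dict(facts)}
    # Walk the lattice from larger group-bys to smaller ones, so that each
    # aggregate is computed from a parent with exactly one extra dimension
    # instead of rescanning the original data.
    for size in range(num_dims - 1, -1, -1):
        for kept in combinations(all_dims, size):
            extra = next(d for d in all_dims if d not in kept)
            parent = tuple(sorted(kept + (extra,)))
            drop = parent.index(extra)  # position of the dimension to sum out
            agg = defaultdict(int)
            for coords, value in cube[parent].items():
                agg[coords[:drop] + coords[drop + 1:]] += value
            cube[kept] = dict(agg)
    return cube


if __name__ == "__main__":
    # Three non-zero cells in a 3-dimensional sparse dataset (illustrative data).
    facts = {(0, 0, 0): 1, (0, 1, 0): 2, (1, 1, 1): 3}
    cube = build_data_cube(facts, 3)
    for dims in sorted(cube, key=len, reverse=True):
        print(dims, cube[dims])
```

A cube over n dimensions has 2^n aggregates, which is why output size, and therefore tiling of output, becomes a concern at scale.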

Bibliographic details

Source
IEEE/ACM International Symposium on Cluster Computing and the Grid | 2002 | 9 pages
Authors
Ge Yang; Ruoming Jin; Gagan Agrawal
Original format PDF
Chinese Library Classification TP338.8-53
Date indexed 2022-08-20 20:45:47

Similar literature

1. Implementing data cube construction using a cluster middleware: algorithms, implementation experience, and performance evaluation [J]. Ge Yang, Ruoming Jin, Gagan Agrawal. Future Generation Computer Systems, 2003, No. 4

2. A Distributed Data Implementation of the Perspective Shear-Warp Volume Rendering Algorithm for Visualisation of Large Astronomical Cubes [J]. Brett Beeson. Publications of the Astronomical Society of Australia, 2013, No. 3

3. Real-time Implementation of Obstacle Detection Algorithms on a Datacube MaxPCI Architecture [J]. Mau-Tsuen Yang, Tarak Gandhi, Rangachar Kasturi. Real-Time Imaging, 2002, No. 2

4. Implementing data cube construction using a cluster middleware: algorithms, implementation experience, and performance evaluation [C]. Ge Yang, Ruoming Jin, Gagan Agrawal. IEEE/ACM International Symposium on Cluster Computing and the Grid, 2002

5. High-performance cluster computing, algorithms, implementations and performance evaluation for computation-intensive applications to promote complex scientific research on turbulent flows [D]. Wang, Hao. 2001

6. Getting to implementation: a protocol for a Hybrid III stepped wedge cluster randomized evaluation of using data-driven implementation strategies to improve cirrhosis care for Veterans [O]. Shari S. Rogal, Vera Yakovchenko, Timothy Morgan. 2020

7. Implementing data cube construction using a cluster middleware: Algorithms, implementation experience, and performance evaluation [O]. Ge Yang, Ruoming Jin, Gagan Agrawal. 2002

8. Performance studies of the multigrid algorithms implemented on hypercube multiprocessor systems [R]. Naik, Vijay K., Taasan, Shlomo. 1987
