AUTO-GC: Automatic translation of data mining applications to GPU clusters


Abstract

Because of the very favorable price-to-performance ratio of GPUs, a popular parallel programming configuration today is a cluster of GPUs. However, extracting performance on such a configuration typically requires programming in both MPI and CUDA, which demands a high degree of expertise and effort. It is clearly desirable to support higher-level programming of this emerging high-performance computing platform. This paper reports on a code generation system that can translate data mining applications to run on a GPU cluster. Our work is driven by the observation that a common processing structure, that of generalized reductions, fits a large number of popular data mining algorithms. In our solution, programmers only need to specify the sequential reduction loop(s), with some additional information about the parameters. We use program analysis and code generation to automatically map the applications to the API of FREERIDE, which is a middleware for parallel data mining. We also automatically generate CUDA code for using the GPU on each node of the cluster. We have evaluated our system using two popular data mining applications, k-means clustering and Principal Component Analysis (PCA). We observed good scalability over the number of computing nodes, and the automatically generated version did not have any noticeable overhead compared to hand-written code. The speedup obtained by using the GPU, over using only the CPU on each node of the cluster, is between 3 and 21.
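The generalized reduction structure mentioned in the abstract can be illustrated with one k-means iteration. The sketch below is plain Python and is not the paper's system: it does not use the FREERIDE API, MPI, or CUDA, and every name in it (`local_reduce`, `combine`, `kmeans_step`) is hypothetical. It only assumes the usual shape of such a loop: each data element updates a reduction object through associative and commutative additions, so chunks of the data can be reduced independently and their reduction objects merged afterwards.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
# Reduction object: one (coordinate sums, point count) pair per cluster.
ReductionObject = List[Tuple[List[float], int]]


def new_reduction_object(k: int, dim: int) -> ReductionObject:
    """Create an empty reduction object for k clusters of dimension dim."""
    return [([0.0] * dim, 0) for _ in range(k)]


def nearest(point: Point, centers: Sequence[Point]) -> int:
    """Return the index of the center closest to point (squared distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    return dists.index(min(dists))


def local_reduce(points: Sequence[Point], centers: Sequence[Point]) -> ReductionObject:
    """Sequential reduction loop over one chunk of the data."""
    robj = new_reduction_object(len(centers), len(centers[0]))
    for point in points:
        j = nearest(point, centers)
        sums, count = robj[j]
        for d, value in enumerate(point):
            sums[d] += value  # order-independent update
        robj[j] = (sums, count + 1)
    return robj


def combine(a: ReductionObject, b: ReductionObject) -> ReductionObject:
    """Merge two reduction objects element-wise."""
    return [
        ([x + y for x, y in zip(sa, sb)], ca + cb)
        for (sa, ca), (sb, cb) in zip(a, b)
    ]


def kmeans_step(chunks: Sequence[Sequence[Point]], centers: Sequence[Point]) -> List[Point]:
    """One k-means iteration: reduce each chunk, merge, then recompute centers."""
    total = new_reduction_object(len(centers), len(centers[0]))
    for chunk in chunks:  # in a cluster setting, each chunk would live on a different node
        total = combine(total, local_reduce(chunk, centers))
    return [
        tuple(s / count for s in sums) if count else tuple(center)
        for (sums, count), center in zip(total, centers)
    ]
```

For example, `kmeans_step([[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]], [(0.0, 0.0), (10.0, 10.0)])` returns `[(0.0, 1.0), (10.0, 11.0)]`, the same centers as processing all four points in a single chunk. In general the chunked and unchunked results agree up to floating-point rounding, which is what makes this loop shape a natural fit for distribution across nodes.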

Bibliographic details

Source: 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, 2010, pp. 1-8 (8 pages)
Authors: Wenjing Ma; Agrawal G.
Original format: PDF
Chinese Library Classification: Programming, software engineering

Similar literature

1. A Compiler and Runtime System for Enabling Data Mining Applications on GPUs [J]. Ma WJ, Agrawal G. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2009, No. 4

2. Study of parallel processing area extraction and data transfer number reduction for automatic GPU offloading of IoT applications [J]. Yamato Yoji. Journal of Intelligent Information Systems. 2020, No. 3

3. Mining diversified association rules in big datasets: A cluster/GPU/genetic approach [J]. Djenouri Youcef, Belhadi Asma, Fournier-Viger Philippe. Information Sciences: An International Journal. 2018

4. AUTO-GC: Automatic Translation of Data Mining Applications to GPU Clusters [C]. Wenjing Ma, Gagan Agrawal. IEEE International Symposium on Parallel and Distributed Processing; Heterogeneity in Computing Workshop; Reconfigurable Architectures Workshop; Workshop on Multi-Threaded Architectures and Applications; Workshop on High-Level Parallel Programming Models Supportive Environments; Workshop on Nature Inspired Distributed Computing; Workshop on High Performance Computational Biology; Advances in Parallel and Distributed Computing Models; Workshop on System Management Techniques, Processes, and Services; Workshop on Parallel and Distributed Scientific and Engineering Computing; Workshop on Parallel and Distributed Computing in Finance; Workshop on Large-Scale Parallel Processing. 2010

5. Automatic transformation and optimization of applications on GPUs and GPU clusters [D]. Ma, Wenjing. 2011

6. Data mining application to healthcare fraud detection: a two-step unsupervised clustering method for outlier detection with administrative databases [O]. Michela Carlotta Massi, Francesca Ieva, Emanuele Lettieri. 2020

7. Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications [O]. Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos. 1998
