Journal of Parallel and Distributed Computing

Accelerating text mining workloads in a MapReduce-based distributed GPU environment


Abstract

Scientific computations have made successful use of GPU-enabled computers, often relying on distributed nodes to overcome the limitations of device memory. Only a handful of text mining applications benefit from such infrastructure. Since the initial steps of text mining are typically data intensive, and ease of deployment is an important factor in developing advanced applications, we introduce a flexible, distributed, MapReduce-based text mining workflow that performs I/O-bound operations on CPUs with industry-standard tools and then runs compute-bound operations on GPUs, using kernels optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes, each with two NVIDIA Tesla M2050s attached, and we achieve considerable speedups for random projection and self-organizing maps.
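The authors' implementation is not reproduced here; as an illustrative sketch only, the CUDA kernel below computes the dense matrix product Y = X * R at the core of random projection (X is an n-by-d document-term matrix, R a d-by-k random matrix), staging tiles of both operands in shared memory so that global-memory loads are coalesced, which is the kind of GPU optimization the abstract refers to. The kernel name, matrix sizes, and uniform test values are assumptions made for this example.

// Sketch of a shared-memory, coalesced random-projection kernel (not the paper's code).
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16

__global__ void randomProjectionKernel(const float* X, const float* R, float* Y,
                                       int n, int d, int k)
{
    __shared__ float xTile[TILE][TILE];          // tile of X staged in shared memory
    __shared__ float rTile[TILE][TILE];          // tile of R staged in shared memory

    int row = blockIdx.y * TILE + threadIdx.y;   // document index in X / Y
    int col = blockIdx.x * TILE + threadIdx.x;   // projected dimension in R / Y

    float acc = 0.0f;
    for (int t = 0; t < (d + TILE - 1) / TILE; ++t) {
        int xCol = t * TILE + threadIdx.x;       // consecutive threads read consecutive
        int rRow = t * TILE + threadIdx.y;       // addresses, so global loads coalesce
        xTile[threadIdx.y][threadIdx.x] = (row < n && xCol < d) ? X[row * d + xCol] : 0.0f;
        rTile[threadIdx.y][threadIdx.x] = (rRow < d && col < k) ? R[rRow * k + col] : 0.0f;
        __syncthreads();

        for (int i = 0; i < TILE; ++i)           // partial dot product from shared memory
            acc += xTile[threadIdx.y][i] * rTile[i][threadIdx.x];
        __syncthreads();
    }
    if (row < n && col < k)
        Y[row * k + col] = acc;
}

int main()
{
    const int n = 64, d = 128, k = 32;           // toy sizes chosen for illustration
    std::vector<float> hX(n * d, 1.0f), hR(d * k, 0.01f), hY(n * k);

    float *dX, *dR, *dY;
    cudaMalloc(&dX, hX.size() * sizeof(float));
    cudaMalloc(&dR, hR.size() * sizeof(float));
    cudaMalloc(&dY, hY.size() * sizeof(float));
    cudaMemcpy(dX, hX.data(), hX.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dR, hR.data(), hR.size() * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((k + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    randomProjectionKernel<<<grid, block>>>(dX, dR, dY, n, d, k);
    cudaMemcpy(hY.data(), dY, hY.size() * sizeof(float), cudaMemcpyDeviceToHost);

    printf("Y[0][0] = %f (expected %.2f)\n", hY[0], d * 1.0f * 0.01f);

    cudaFree(dX); cudaFree(dR); cudaFree(dY);
    return 0;
}

In the workflow described by the abstract, a kernel of this kind would sit in the compute-bound GPU stage, while tokenization and term counting remain I/O-bound CPU-side MapReduce steps.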
