Journal of Parallel and Distributed Computing

Accelerating text mining workloads in a MapReduce-based distributed GPU environment


Abstract

Scientific computations have made successful use of GPU-enabled computers, often relying on distributed nodes to overcome the limitations of device memory. Only a handful of text mining applications benefit from such infrastructure. Since the initial steps of text mining are typically data intensive, and ease of deployment is an important factor in developing advanced applications, we introduce a flexible, distributed, MapReduce-based text mining workflow that performs I/O-bound operations on CPUs with industry-standard tools and then runs compute-bound operations on GPUs, using kernels optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes, each with two NVIDIA Tesla M2050s attached, and we achieve considerable speedups for random projection and self-organizing maps.
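The authors' implementation is not reproduced here; as an illustrative sketch only, the CUDA kernel below computes the dense matrix product Y = X * R at the core of random projection (X is an n-by-d document-term matrix, R a d-by-k random matrix), staging tiles of both operands in shared memory so that global-memory loads are coalesced, which is the kind of GPU optimization the abstract refers to. The kernel name, matrix sizes, and uniform test values are assumptions made for this example.

// Sketch of a shared-memory, coalesced random-projection kernel (not the paper's code).
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16

__global__ void randomProjectionKernel(const float* X, const float* R, float* Y,
                                       int n, int d, int k)
{
    __shared__ float xTile[TILE][TILE];          // tile of X staged in shared memory
    __shared__ float rTile[TILE][TILE];          // tile of R staged in shared memory

    int row = blockIdx.y * TILE + threadIdx.y;   // document index in X / Y
    int col = blockIdx.x * TILE + threadIdx.x;   // projected dimension in R / Y

    float acc = 0.0f;
    for (int t = 0; t < (d + TILE - 1) / TILE; ++t) {
        int xCol = t * TILE + threadIdx.x;       // consecutive threads read consecutive
        int rRow = t * TILE + threadIdx.y;       // addresses, so global loads coalesce
        xTile[threadIdx.y][threadIdx.x] = (row < n && xCol < d) ? X[row * d + xCol] : 0.0f;
        rTile[threadIdx.y][threadIdx.x] = (rRow < d && col < k) ? R[rRow * k + col] : 0.0f;
        __syncthreads();

        for (int i = 0; i < TILE; ++i)           // partial dot product from shared memory
            acc += xTile[threadIdx.y][i] * rTile[i][threadIdx.x];
        __syncthreads();
    }
    if (row < n && col < k)
        Y[row * k + col] = acc;
}

int main()
{
    const int n = 64, d = 128, k = 32;           // toy sizes chosen for illustration
    std::vector<float> hX(n * d, 1.0f), hR(d * k, 0.01f), hY(n * k);

    float *dX, *dR, *dY;
    cudaMalloc(&dX, hX.size() * sizeof(float));
    cudaMalloc(&dR, hR.size() * sizeof(float));
    cudaMalloc(&dY, hY.size() * sizeof(float));
    cudaMemcpy(dX, hX.data(), hX.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dR, hR.data(), hR.size() * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((k + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    randomProjectionKernel<<<grid, block>>>(dX, dR, dY, n, d, k);
    cudaMemcpy(hY.data(), dY, hY.size() * sizeof(float), cudaMemcpyDeviceToHost);

    printf("Y[0][0] = %f (expected %.2f)\n", hY[0], d * 1.0f * 0.01f);

    cudaFree(dX); cudaFree(dR); cudaFree(dY);
    return 0;
}

In the workflow described by the abstract, a kernel of this kind would sit in the compute-bound GPU stage, while tokenization and term counting remain I/O-bound CPU-side MapReduce steps.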
