Conference: International Symposium on Computer Architecture and High Performance Computing

Efficient Execution of Microscopy Image Analysis on CPU, GPU, and MIC Equipped Cluster Systems



Abstract

High performance computing is experiencing a major paradigm shift with the introduction of accelerators, such as graphics processing units (GPUs) and Intel Xeon Phi (MIC). These processors have made tremendous computing power available at low cost, and are transforming machines into hybrid systems equipped with CPUs and accelerators. Although these systems can deliver very high peak performance, making full use of their resources in real-world applications is a complex problem. Most current applications deployed to these machines are still executed on a single processor, leaving the other devices underutilized. In this paper we explore a scenario in which applications are composed of hierarchical dataflow tasks that are allocated at coarse grain to the nodes of a distributed-memory machine, each of which may itself be composed of several finer-grain tasks that can be allocated to different devices within the node. We propose and implement novel performance-aware scheduling techniques that can be used to allocate tasks to devices. We evaluate our techniques using a pathology image analysis application used to investigate brain cancer morphology, and our experimental evaluation shows that the proposed scheduling strategies significantly outperform other efficient scheduling techniques, such as Heterogeneous Earliest Finish Time (HEFT), in cooperative executions using CPUs, GPUs, and MICs. We also experimentally show that our strategies are less sensitive to inaccuracy in the scheduling input data and that the performance gains are maintained as the application scales.
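As a rough illustration of the HEFT baseline named in the abstract, the sketch below assigns independent tasks to devices by earliest estimated finish time. It is not the scheduling strategy the paper proposes, and it simplifies HEFT by assuming no task dependencies and no data-transfer costs. The function name `eft_schedule`, the task names, and the timing estimates are all hypothetical.

```python
from typing import Dict, Tuple


def eft_schedule(costs: Dict[str, Dict[str, float]]) -> Tuple[Dict[str, str], float]:
    """Greedy earliest-finish-time assignment of independent tasks.

    costs[task][device] is an estimated execution time (an assumed input,
    e.g. obtained from profiling). Returns the task-to-device mapping and
    the resulting makespan.
    """
    # Collect every device that appears in the estimates.
    devices = sorted({d for per_dev in costs.values() for d in per_dev})
    # Time at which each device becomes free.
    ready_at = {d: 0.0 for d in devices}

    # Without dependencies, the HEFT rank of a task reduces to its mean
    # cost across devices; schedule the most expensive tasks first.
    order = sorted(costs, key=lambda t: -sum(costs[t].values()) / len(costs[t]))

    assignment: Dict[str, str] = {}
    for task in order:
        # Pick the device on which this task would finish earliest.
        best = min(costs[task], key=lambda d: ready_at[d] + costs[task][d])
        ready_at[best] += costs[task][best]
        assignment[task] = best

    return assignment, max(ready_at.values())


# Hypothetical estimates (seconds) for three fine-grain tasks on one node.
example_costs = {
    "segment": {"cpu": 8.0, "gpu": 1.0},
    "features": {"cpu": 4.0, "gpu": 2.0},
    "io": {"cpu": 1.0, "gpu": 5.0},
}
example_assignment, example_makespan = eft_schedule(example_costs)
```

Because the choice of device depends directly on the estimated times, errors in those estimates can change the assignment, which is the kind of sensitivity to scheduling input data that the abstract says the proposed strategies reduce.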
