Conference: IEEE International Conference on Cluster Computing

High-performance X-ray tomography reconstruction algorithm based on heterogeneous accelerated computing systems


Abstract

Many medical image processing applications need high processing speed to achieve near-real-time image reconstruction. As a result, massively parallel architectures based on accelerators, especially GPGPUs, have become very popular in the field. In this paper we present Mangoose++, an application that performs X-ray Computed Tomography (CT) reconstruction from medical images, based on a new implementation of the FDK algorithm. Mangoose++ has been designed and implemented to exploit the parallelism available on several hardware accelerator platforms, such as GPGPUs and Intel Xeon Phi accelerators. We describe the design and implementation of the application on three types of platform (multi-core CPU, GPGPU, and Intel Xeon Phi) and the evaluation carried out to test the performance, resource utilization, and scalability of each one. Moreover, to avoid hardware dependencies, we have also implemented the application using the OpenACC runtime, to check portability and the overhead incurred when using runtimes. The evaluation results show that our solution is faster than recent related works and that, in terms of computation, the Intel Xeon Phi and CUDA-based GPU versions obtain similar results as the problem size increases. The evaluation also shows that OpenACC improves programmability, because there is a single version of the source code. However, OpenACC heavily affects the performance of Mangoose++, which is reduced by 50% compared with the many-core versions, although the penalty is less drastic compared with the CPU version.
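As an illustration of the kind of loop the abstract refers to, the following is a minimal sketch of a voxel-driven FDK backprojection step in C, with optional OpenACC directives on the voxel loops. It is not the Mangoose++ code. The `Geometry` struct, the function name `fdk_backproject`, the precomputed `cosb`/`sinb` tables, the nearest-neighbour detector lookup, and the geometric convention (circular orbit, virtual flat detector through the isocentre, weight U² with U = d_so / (d_so − s)) are all assumptions made for the example. The projections are assumed to be already weighted and ramp-filtered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical geometry description; none of these names come from Mangoose++. */
typedef struct {
    int n;        /* volume is n x n x n voxels, centred on the rotation axis */
    int nu, nv;   /* detector columns and rows */
    int nproj;    /* projection angles, evenly spaced over a full turn */
    float voxel;  /* voxel edge length */
    float du, dv; /* pixel pitch of the virtual detector through the isocentre */
    float d_so;   /* source-to-rotation-axis distance */
    float dbeta;  /* angular step in radians */
} Geometry;

/* Accumulate already-filtered projections into vol (backprojection only).
   cosb[p] and sinb[p] hold the precomputed cosine and sine of angle p. */
void fdk_backproject(const Geometry *g, const float *cosb, const float *sinb,
                     const float *proj, float *vol)
{
    const int n = g->n, nu = g->nu, nv = g->nv, nproj = g->nproj;
    const float voxel = g->voxel, du = g->du, dv = g->dv;
    const float d_so = g->d_so, dbeta = g->dbeta;
    const float c = 0.5f * (n - 1);                          /* volume centre index */
    const float cu = 0.5f * (nu - 1), cv = 0.5f * (nv - 1);  /* detector centre */

    assert(n > 0 && nu > 0 && nv > 0 && nproj > 0);

    /* Every voxel is independent, so the three voxel loops can be parallelised.
       The directives are only seen by compilers that define _OPENACC. */
#ifdef _OPENACC
#pragma acc parallel loop collapse(3) \
    copyin(proj[0:nproj*nv*nu], cosb[0:nproj], sinb[0:nproj]) copy(vol[0:n*n*n])
#endif
    for (int k = 0; k < n; ++k) {
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < n; ++i) {
                const float x = (i - c) * voxel;
                const float y = (j - c) * voxel;
                const float z = (k - c) * voxel;
                float acc = 0.0f;
#ifdef _OPENACC
#pragma acc loop seq
#endif
                for (int p = 0; p < nproj; ++p) {
                    /* Rotate the voxel into the frame of projection p. */
                    const float t = x * cosb[p] + y * sinb[p];
                    const float s = -x * sinb[p] + y * cosb[p];
                    const float U = d_so / (d_so - s);  /* cone-beam magnification */
                    /* Detector coordinates, shifted by 0.5 for rounding. */
                    const float fu = t * U / du + cu + 0.5f;
                    const float fv = z * U / dv + cv + 0.5f;
                    if (fu < 0.0f || fv < 0.0f || fu >= (float)nu || fv >= (float)nv)
                        continue;  /* ray misses the detector */
                    const int iu = (int)fu, iv = (int)fv;  /* nearest pixel */
                    acc += U * U * proj[((size_t)p * nv + iv) * nu + iu];
                }
                /* Factor 1/2: over a full turn each ray is measured twice. */
                vol[((size_t)k * n + j) * n + i] += 0.5f * dbeta * acc;
            }
        }
    }
}
```

Built without OpenACC support, the same source runs serially on the CPU, which is the single-source property the abstract attributes to OpenACC. Some formulations fold the factor 1/2 into the filter instead of the backprojection, and a production implementation would normally interpolate bilinearly on the detector rather than take the nearest pixel.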
