Conference: IEEE International Conference on Image Processing
Fast and Parallel Computation of the Discrete Periodic Radon Transform on GPUs, multi-core CPUs and FPGAs

Abstract

The Discrete Periodic Radon Transform (DPRT) has many important applications in reconstructing images from their projections and has recently been used in fast and scalable architectures for computing 2D convolutions. Unfortunately, the direct computation of the DPRT involves O(N^3) additions and memory accesses, which can be very costly on single-core architectures. This paper presents new and efficient algorithms for computing the DPRT and its inverse on multi-core CPUs and GPUs. The results are compared against specialized hardware implementations (FPGAs/ASICs) and provide significant evidence of the success of the new algorithms. On an 8-core CPU (Intel Xeon) with support for two threads per core, FastDirDPRT and FastDirInvDPRT achieve a speedup of approximately 10× (up to 12.83×) over the single-core CPU implementation. On a 2048-core GPU (GTX 980), FastRayDPRT and FastRayInvDPRT achieve speedups ranging from 526× (for 127 × 127 images) to 873× (for 1021 × 1021 images), which approximate the ideal achievable speedups. The DPRT can be computed exactly and in real time (30 frames per second) for 1471 × 1471 images using FastRayDPRT on the GPU. Furthermore, the GPU algorithms approximate the performance of an efficient FPGA implementation using 2N parallel cores at 100 MHz.
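To make the O(N^3) cost of the direct computation concrete, the following is a minimal single-threaded sketch. It assumes the commonly used DPRT definition for an N × N image with N prime (the sizes quoted in the abstract, 127, 1021, and 1471, are all prime): R(m, d) is the sum over i of f(i, (d + m·i) mod N) for m = 0, ..., N−1, and R(N, d) is the sum of row d. The function names `dprt` and `inverse_dprt` are illustrative only. This is the naive baseline, not the paper's FastDirDPRT or FastRayDPRT algorithms, whose details the abstract does not give.

```python
def dprt(f):
    """Direct DPRT of an N x N image given as a list of rows (N assumed prime)."""
    n = len(f)
    # N + 1 projections, each holding N ray sums.
    r = [[0] * n for _ in range(n + 1)]
    for m in range(n):          # N sloped directions
        for d in range(n):      # N rays per direction
            # Each ray adds up N pixels, so the triple loop costs about N^3 additions.
            r[m][d] = sum(f[i][(d + m * i) % n] for i in range(n))
    for d in range(n):          # final projection: plain row sums
        r[n][d] = sum(f[d])
    return r


def inverse_dprt(r):
    """Invert dprt() exactly for integer-valued images (N assumed prime)."""
    n = len(r) - 1
    total = sum(r[0])           # every projection sums to the total image sum
    f = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = sum(r[m][(j - m * i) % n] for m in range(n))
            # acc equals N * f[i][j] + total - rowsum(i), so the division is exact.
            f[i][j] = (acc - total + r[n][i]) // n
    return f


# Small round-trip check on a 5 x 5 integer image (hypothetical test data).
image = [[(3 * i + 5 * j + i * j) % 7 for j in range(5)] for i in range(5)]
assert inverse_dprt(dprt(image)) == image
```

Each ray sum in the two outer loops is independent of the others. That independence is what makes the transform a natural fit for one-thread-per-ray or one-thread-per-direction parallelization on multi-core CPUs and GPUs.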
