Research and implementation of a high performance parallel computing digital down converter on graphics processing unit

Guo‐lin Shao; Xing‐shu Chen; Lu Yang

Abstract

The digital down converter (DDC) is a time-intensive and data-intensive computing task and is considered a key technology in software-defined radio. This paper proposes a high-performance implementation of a DDC on a graphics processing unit (GPU) using CUDA, composed of a numerically controlled oscillator (NCO) stage, a cascaded integrator-comb (CIC) decimation filter stage, and a finite impulse response (FIR) filter stage. The GPU implementation and optimization of all the stages are studied in detail. Additionally, to handle long-duration signals, the signal data sequence is split into segments; the overlap-save and overlap-add mechanisms are applied in the CIC stage and the FIR stage, respectively. Finally, experiments were conducted to evaluate the performance of the GPU-based DDC against a sequential CPU implementation and an OpenMP implementation (16 threads). The experimental results demonstrate that the DDC achieves significant improvements on the GPU: the maximum speedups in the NCO stage, CIC stage, and FIR stage exceed 1242, 527, and 179 times, respectively, including data transfer, kernel execution, and other processing operations, and the overall speedup of the DDC exceeds 180 times. The speedups of the GPU implementation are also far above those of the OpenMP implementation (about 2.5-6.4 times).
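
To make the three-stage data flow named in the abstract concrete, the following is a minimal sequential sketch in Python. It is not the authors' CUDA implementation, and none of the GPU optimizations are represented. All function names and parameters are hypothetical choices for illustration, including the three CIC stages with differential delay 1, the 4-tap moving-average FIR, and the sample rates. The sketch also shows overlap-add segmentation for the FIR stage only; the overlap-save handling of the CIC stage is not shown.

```python
import cmath
import math


def nco_mix(samples, f_carrier, f_sample):
    """NCO stage: multiply by exp(-j*2*pi*f_carrier*n/f_sample) to move the carrier to 0 Hz."""
    step = -2.0 * math.pi * f_carrier / f_sample
    return [x * cmath.exp(1j * step * n) for n, x in enumerate(samples)]


def cic_decimate(samples, rate, stages=3):
    """CIC stage: `stages` integrators, decimation by `rate`, then `stages` combs (delay 1)."""
    data = list(samples)
    for _ in range(stages):          # integrators: running sums at the input rate
        acc, out = 0, []
        for x in data:
            acc += x
            out.append(acc)
        data = out
    data = data[rate - 1::rate]      # keep every `rate`-th sample
    for _ in range(stages):          # combs: y[n] = x[n] - x[n-1] at the output rate
        prev, out = 0, []
        for x in data:
            out.append(x - prev)
            prev = x
        data = out
    gain = rate ** stages            # DC gain of the cascade, removed by normalization
    return [x / gain for x in data]


def fir_filter(samples, taps):
    """FIR stage: direct-form convolution, output truncated to len(samples)."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_overlap_add(samples, taps, block):
    """Same FIR, computed segment by segment; each segment's tail is added into the next."""
    out = [0] * (len(samples) + len(taps) - 1)
    for start in range(0, len(samples), block):
        for i, x in enumerate(samples[start:start + block]):
            for k, h in enumerate(taps):
                out[start + i + k] += x * h
    return out[:len(samples)]


# Assumed test signal: a real cosine sitting exactly on the carrier frequency.
f_sample, f_carrier, rate = 1000.0, 250.0, 4
signal = [math.cos(2.0 * math.pi * f_carrier * n / f_sample) for n in range(400)]
taps = [0.25] * 4                    # assumed moving-average low-pass

mixed = nco_mix(signal, f_carrier, f_sample)
decimated = cic_decimate(mixed, rate)
baseband = fir_filter(decimated, taps)
segmented = fir_overlap_add(decimated, taps, block=16)
```

With these assumed values the mixer leaves a constant term of 0.5 plus an image at half the sample rate, which falls on a null of the decimate-by-4 CIC, so `baseband` settles near 0.5 once the filter transients pass. `segmented` matches `baseband` up to floating-point rounding, which is the property that lets a long signal be processed in independent segments.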

Bibliographic details

Source
Concurrency and Computation: Practice and Experience | 2017, Issue 8 | pp. 1-16 | 16 pages
Authors
Guo‐lin Shao; Xing‐shu Chen; Lu Yang
Author affiliations

Department of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China

Original format: PDF
Language: English
Keywords
cascaded integrator-comb (CIC) decimation filter; CUDA; digital down converter; finite impulse response (FIR) filter; GPU implementation; numerically controlled oscillator (NCO)

Similar literature

1. High performance direct gravitational N-body simulations on graphics processing units II: An implementation in CUDA [J]. Belleman RG, Bedorf J, Zwart SFP. New Astronomy. 2008, No. 2
2. Multicore Processors and Graphics Processing Unit Accelerators for Parallel Retrieval of Aerosol Optical Depth From Satellite Data: Implementation, Performance, and Energy Efficiency [J]. Liu Jia, Feld Dustin, Xue Yong. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2015, No. 5
3. Implementation and performance of a general purpose graphics processing unit in hyperspectral image analysis [J]. H.M.A. van der Werff, W.H. Bakker. International Journal of Applied Earth Observation and Geoinformation. 2014
4. Implementation of a digital down converter using graphics processing unit [C]. Xiao Ma, Lixia Deng, Yuping Zhao. IEEE International Conference on Communication Technology. 2013
5. Digital Signal Processing Algorithms Implemented on Graphics Processing Units and Software Development for Phased Array Receiver Systems [D]. Ruzindana, Mark William. 2021
6. Graphics Processing Unit (GPU) implementation of image processing algorithms to improve system performance of the Control Acquisition Processing and Image Display System (CAPIDS) of the Micro-Angiographic Fluoroscope (MAF) [O]. S.N. Swetadri Vasan, Ciprian N. Ionita, A.H. Titus
7. Graphics processing unit (GPU) implementation of image processing algorithms to improve system performance of the control acquisition, processing, and image display system (CAPIDS) of the micro-angiographic fluoroscope (MAF) [O]. S. N. Swetadri Vasan, Ciprian N. Ionita, A. H. Titus. 2012