Parallelization techniques for implementing trellis algorithms on graphics processors

Abstract

In this paper, we study different schemes to parallelize trellis algorithms for efficient implementation on a GPU. We consider parallelization schemes at the packet level, subblock level and trellis level to increase the number of threads in a GPU implementation. At the trellis level, we consider state-level, forward-backward traversal and branch-metric parallelism. To evaluate the performance of the different schemes, an LTE uplink Turbo decoder is implemented on an NVIDIA GTX470 GPU. Tradeoffs between throughput, latency and bit error rate are presented. Our most balanced configuration processes multiple subblocks of a packet simultaneously, in conjunction with recovery schemes and trellis-level parallelism; it achieves a throughput of 19.65 Mbps with a latency of 0.56 ms at a bit error rate of $10^{-5}$ for a channel SNR of 1.3 dB. We also show how different combinations of parallelization schemes can be used to satisfy systems with widely varying requirements of throughput, latency and bit error rate.
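
To make the subblock-level and state-level schemes named in the abstract more concrete, here is a minimal sketch in Python (not the paper's GPU code). It runs a max-log forward recursion on a toy 4-state shift-register trellis; the LTE Turbo code's constituent trellis has 8 states. Every name in it (`next_state`, `forward_step`, `subblock_forward`, `warmup`) is hypothetical, and the recovery scheme shown, a warm-up run over the steps preceding each subblock, is an assumption, since the abstract does not say which recovery schemes the authors use. Plain loops stand in for what would be GPU threads.

```python
NUM_STATES = 4  # toy trellis (assumption); not the 8-state LTE trellis
NEG_INF = float("-inf")
KNOWN_START = [0.0] + [NEG_INF] * (NUM_STATES - 1)  # encoder starts in state 0
UNIFORM = [0.0] * NUM_STATES                        # start state unknown


def next_state(state, bit):
    """Hypothetical shift-register trellis: shift in the new input bit."""
    return ((state << 1) | bit) & (NUM_STATES - 1)


# For each destination state, the (source state, input bit) pairs that reach it.
PREDECESSORS = [
    [(s, bit) for s in range(NUM_STATES) for bit in (0, 1)
     if next_state(s, bit) == t]
    for t in range(NUM_STATES)
]


def forward_step(alpha, gamma_k):
    """One max-log forward update.

    Each destination state reads only the previous metrics, so the entries
    are independent and could be computed by one thread per state
    (state-level parallelism).
    """
    return [
        max(alpha[s] + gamma_k[s][bit] for s, bit in PREDECESSORS[t])
        for t in range(NUM_STATES)
    ]


def forward(gammas, alpha0, start, stop):
    """Sequential forward recursion over trellis steps start..stop-1."""
    alpha = list(alpha0)
    metrics = []
    for k in range(start, stop):
        alpha = forward_step(alpha, gammas[k])
        metrics.append(alpha)
    return metrics


def subblock_forward(gammas, num_subblocks, warmup):
    """Forward metrics computed one subblock at a time.

    The subblocks do not depend on each other, so each could run
    concurrently (subblock-level parallelism). A subblock first runs over
    up to `warmup` preceding steps to recover approximate boundary metrics.
    """
    total = len(gammas)
    size = -(-total // num_subblocks)  # ceiling division
    result = []
    for b in range(num_subblocks):
        start = b * size
        stop = min(start + size, total)
        warm_start = max(0, start - warmup)
        # Only a run that begins at step 0 knows the true start state.
        init = KNOWN_START if warm_start == 0 else UNIFORM
        metrics = forward(gammas, init, warm_start, stop)
        result.extend(metrics[start - warm_start:])  # drop warm-up steps
    return result
```

In this sketch, `gammas[k][s][bit]` is the branch metric for leaving state `s` with input `bit` at step `k`. A warm-up that reaches back to step 0 reproduces the sequential metrics exactly, while a shorter warm-up does less redundant work but only approximates them, which is one way to picture the throughput, latency and bit error rate tradeoff the abstract mentions.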

Bibliographic details

Source: IEEE International Symposium on Circuits and Systems, 2013, pp. 1220-1223 (4 pages)
Authors: Zheng Q.; Chen Y.; Dreslinski R.; Chakrabarti C.
Original format: PDF
