IEEE International Symposium on Circuits and Systems

Parallelization techniques for implementing trellis algorithms on graphics processors


Abstract

In this paper, we study schemes for parallelizing trellis algorithms for efficient implementation on a GPU. We consider parallelization at the packet level, subblock level, and trellis level to increase the number of threads in a GPU implementation. At the trellis level, we consider state-level, forward-backward traversal, and branch-metric parallelism. To evaluate the performance of the different schemes, an LTE uplink Turbo decoder is implemented on an NVIDIA GTX470 GPU. Tradeoffs between throughput, latency, and bit error rate are presented. Our most balanced configuration processes multiple subblocks of a packet simultaneously, in conjunction with recovery schemes and trellis-level parallelism; it achieves a throughput of 19.65 Mbps with a latency of 0.56 ms at a bit error rate of 10⁻⁵ and a channel SNR of 1.3 dB. We also show how different combinations of parallelization schemes can be used to satisfy systems with widely varying requirements for throughput, latency, and bit error rate.
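
As a rough illustration of two of the schemes named in the abstract, state-level and subblock-level parallelism, the following Python sketch runs a max-log forward recursion on a toy trellis. It is not the paper's implementation. The 4-state shift-register trellis, all function names, the uniform initialization of each subblock, and the warm-up style of recovery (re-running the recursion over the last few steps of the preceding subblock) are assumptions made for illustration. On a GPU, each call to `update_state` would map to one thread and each subblock to an independent group of threads; here they are plain loops.

```python
NUM_STATES = 4  # hypothetical toy trellis, not the 8-state LTE constituent code


def next_state(state, bit):
    """Shift-register transition of the toy trellis (an assumption)."""
    return ((state << 1) | bit) % NUM_STATES


# PREV[t] lists every (previous_state, input_bit) branch that ends in state t.
PREV = [[] for _ in range(NUM_STATES)]
for s in range(NUM_STATES):
    for u in (0, 1):
        PREV[next_state(s, u)].append((s, u))


def update_state(alpha_prev, gamma_k, t):
    """Work of one 'thread': max-log forward metric of target state t."""
    return max(alpha_prev[s] + gamma_k[s][u] for s, u in PREV[t])


def forward(gamma, alpha_init, start, stop):
    """Forward recursion over trellis steps start..stop-1.

    gamma[k][s][u] is the branch metric at step k for leaving state s
    with input bit u.
    """
    alpha = list(alpha_init)
    for k in range(start, stop):
        # State-level parallelism: the NUM_STATES updates are independent.
        alpha = [update_state(alpha, gamma[k], t) for t in range(NUM_STATES)]
    return alpha


def subblock_end_metrics(gamma, num_subblocks, warmup):
    """Forward metrics at the end of each subblock, computed independently.

    Assumes len(gamma) is divisible by num_subblocks. Each subblock starts
    from uniform metrics and first runs over up to `warmup` preceding steps
    to recover an approximate starting state.
    """
    size = len(gamma) // num_subblocks
    uniform = [0.0] * NUM_STATES
    results = []
    for p in range(num_subblocks):  # subblock-level parallelism
        begin, end = p * size, (p + 1) * size
        warm_start = max(0, begin - warmup)
        results.append(forward(gamma, uniform, warm_start, end))
    return results
```

A longer warm-up costs more redundant computation per subblock but brings each subblock's starting metrics closer to those of a sequential pass over the whole packet, which is one way to picture the tradeoff between throughput and bit error rate that the abstract mentions.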
